
APPLICATION 
Docket Number: PATH03-14 



TITLE OF THE INVENTION: 

NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO M. CATARRHALIS 
FOR DIAGNOSTICS AND THERAPEUTICS 

APPLICANTS: Gary L. Breton

RELATED APPLICATIONS: 

This application claims the benefit of U.S. Provisional Application Serial Number 
60/128,476, filed April 9, 1999, the entire teachings of which are incorporated herein by
reference. 






BACKGROUND OF THE INVENTION 

The genus Moraxella is a member of the family Neisseriaceae. The 10 species of this 
genus are separated into 2 subgenera, Moraxella (rods) and Branhamella (cocci). Moraxella 
are gram-negative, aerobic, oxidase-positive, and usually catalase-positive. (Bovre, K. 1984.
Genus E. Moraxella Lwoff 1939, 173 emend. Henriksen and Bovre 1968, 391, 105. Krieg
and Holt (editors) In Bergey's Manual of Systematic Bacteriology, 1:296-303.). Moraxella
catarrhalis, a member of the subgenus Branhamella, was previously called Branhamella
catarrhalis and Neisseria catarrhalis. 

Moraxella catarrhalis is frequently isolated from the nasal cavity of humans, and 
until recently, was considered a nonpathogenic commensal of the upper respiratory tract.
Currently it is the most important lower respiratory pathogen after S. pneumoniae and H.
influenzae (Doern, G., et al., 1986. Diagn. Microbiol. Infect. Dis. 4:191-201.). It is a
common cause of otitis media in children, acute bronchitis or pneumonia in adults, and
sinusitis (Wood, G., et al., 1996. Clin. Infect. Dis. 22:632-636.). Bacteremia, meningitis,
skeletal infections and endocarditis due to M. catarrhalis are rare, but are observed in
immunocompromised individuals (Aebi, C., et al., 1998. Infect. Immun. 66:540-548.).
Concern for M. catarrhalis infections of cystic fibrosis (CF) patients is growing. Damage to
the respiratory tract by M. catarrhalis could promote invasion by other pathogens such as P.
aeruginosa in CF patients (Deneuville, E., et al., 1995. Acta Paediatr. 84:1212.). M.
catarrhalis is also associated with acute laryngitis. In one study, 50% of patients with acute
laryngitis were colonized with M. catarrhalis (Hol, C., et al., 1996. Journal of Infectious
Diseases. 174:636-638.), while it is isolated from healthy adults at a rate of 6%-11%.
The colonization rates of children can be much higher, with average rates of 30%-35%
(Sehgal, S.C., et al., 1994. Infection 22:193-196.). In some hospitals, M. catarrhalis accounts
for half of all the respiratory infections (Bluestone, C., et al., 1992. Pediatr. Infect. Dis. J.
11:S7-S11.). 

Increasing levels of antibiotic resistance have recently been observed in clinical isolates of
M. catarrhalis. Before 1980, fewer than 10% of M. catarrhalis isolates were
β-lactamase-positive. Currently, most clinical isolates produce β-lactamase, making them
resistant to β-lactam antibiotics such as penicillin (Doern, G., et al., 1996. Antimicrob.
Agents Chemother. 40:2884-2886.). M. catarrhalis is intrinsically resistant to a small group
of drugs that includes vancomycin and trimethoprim (Wallace, R.J. 1990. Am. J. Med.
88:46S-50S), and is becoming increasingly resistant to sulfamethoxazole, oral
cephalosporins, and macrolides (Hoppe, H.L. 1998. Am. J. Health Syst. Pharm. 55:1881-97).

Although M. catarrhalis was once considered only as part of the nonpathogenic flora
of the upper respiratory tract, it is emerging as an important respiratory pathogen. Currently, 
it is the third leading cause of lower respiratory tract infections and otitis media. Sequencing 
and further analysis of this genome will aid in identification of essential genes for 
development of drug targets, and reduce the health threat this organism poses. 



SUMMARY OF THE INVENTION 

The present invention fulfills the need for diagnostic tools and therapeutics by 
providing bacterial-specific compositions and methods for detecting Moraxella species 
including M. catarrhalis, as well as compositions and methods useful for treating and 
preventing Moraxella infection, in particular, M. catarrhalis infection, in vertebrates
including mammals. 

The present invention encompasses isolated nucleic acids and polypeptides derived 
from M. catarrhalis that are useful as reagents for diagnosis of bacterial disease, components 
of effective antibacterial vaccines, and/or as targets for antibacterial drugs including
anti-M. catarrhalis drugs. They can also be used to detect the presence of M. catarrhalis and other
Moraxella species in a sample; and in screening compounds for the ability to interfere with 
the M. catarrhalis life cycle or to inhibit M. catarrhalis infection. They also have use as 
biocontrol agents for plants. 

In one aspect, the invention features compositions of nucleic acids corresponding to
entire coding sequences of M. catarrhalis proteins (SEQ ID NO: 1 - SEQ ID NO: 1920),
including surface or secreted proteins or parts thereof, nucleic acids capable of binding
mRNA encoding M. catarrhalis proteins to block protein translation, and methods for producing
M. catarrhalis proteins or parts thereof using peptide synthesis and recombinant DNA
techniques. This invention also features antibodies and nucleic acids useful as probes to
detect M. catarrhalis infection. In addition, vaccine compositions and methods for the
protection or treatment of infection by M. catarrhalis are within the scope of this invention. 

The nucleotide sequences provided in SEQ ID NO: 1 - SEQ ID NO: 1920, a fragment
thereof, or a nucleotide sequence at least about 99.5% identical to a sequence contained
within SEQ ID NO: 1 - SEQ ID NO: 1920 may be "provided" in a variety of media to
facilitate use thereof. As used herein, "provided" refers to a manufacture, other than an
isolated nucleic acid molecule, which contains a nucleotide sequence of the present
invention, i.e., the nucleotide sequence provided in SEQ ID NO: 1 - SEQ ID NO: 1920, a
fragment thereof, or a nucleotide sequence at least about 99.5% identical to a sequence
contained within SEQ ID NO: 1 - SEQ ID NO: 1920. Uses for and methods for providing
nucleotide sequences in a variety of media are well known in the art (see, e.g., EPO
Publication No. EP 0 756 006).

In one application of this embodiment, a nucleotide sequence of the present invention 
can be recorded on computer readable media. As used herein, "computer readable media"
refers to any media which can be read and accessed directly by a computer. Such media
include, but are not limited to: magnetic storage media, such as floppy discs, hard disc
storage media, and magnetic tape; optical storage media such as CD-ROM; electrical storage
media such as RAM and ROM; and hybrids of these categories such as magnetic/optical
storage media. A person skilled in the art can readily appreciate how any of the presently
known computer readable media can be used to create a manufacture comprising computer
readable media having recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on computer
readable media. A person skilled in the art can readily adopt any of the presently known
methods for recording information on computer readable media to generate manufactures
comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a person skilled in the art for
creating a computer readable medium having recorded thereon a nucleotide sequence of the
present invention. The choice of the data storage structure will generally be based on the
means chosen to access the stored information. In addition, a variety of data processor
programs and formats can be used to store the nucleotide sequence information of the
present invention on computer readable media. The sequence information can be
represented in a word processing text file, formatted in commercially-available software
such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored
in a database application, such as DB2, Sybase, Oracle, or the like. A person skilled in the
art can readily adapt any number of data processor structuring formats (e.g., text file or
database) in order to obtain computer readable media having recorded thereon the nucleotide
sequence information of the present invention.
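
By way of non-limiting illustration, the two storage options named above (an ASCII text
file and a database application) might be sketched as follows. This is only a sketch under
stated assumptions: the FASTA-style layout, the function names, the table name, and the use
of an in-memory SQLite database as a stand-in for DB2, Sybase, or Oracle are all choices
made for this example, and the sequences shown are placeholders rather than entries from
the Sequence Listing.

```python
import sqlite3


def to_fasta(records, width=60):
    """Render (identifier, sequence) pairs as FASTA-style ASCII text."""
    lines = []
    for seq_id, seq in records:
        lines.append(f">{seq_id}")
        # Wrap the residues so that no line exceeds `width` characters.
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


def store_in_database(records):
    """Record the same pairs in an in-memory SQLite table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequences (seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO sequences VALUES (?, ?)", records)
    conn.commit()
    return conn


# Placeholder data, NOT taken from the Sequence Listing.
records = [("example_1", "ATGGCTAAAGGTTAA"), ("example_2", "ATGCCCGGGTTTTGA")]

fasta_text = to_fasta(records, width=10)
conn = store_in_database(records)
row = conn.execute(
    "SELECT residues FROM sequences WHERE seq_id = ?", ("example_2",)
).fetchone()
```

Either representation keeps the identifier and the residues together, so the choice between
them can follow the means chosen to access the stored information, as noted above.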

By providing the nucleotide sequence of SEQ ID NO: 1 - SEQ ID NO: 1920, a
fragment thereof, or a nucleotide sequence at least about 99.5% identical to SEQ ID NO: 1 -
SEQ ID NO: 1920 in computer readable form, a person skilled in the art can routinely access
the coding sequence information for a variety of purposes. Computer software is publicly
available which allows a person skilled in the art to access sequence information provided in
a computer readable medium. Examples of such computer software include programs of the
"Staden Package", "DNA Star", "MacVector", GCG "Wisconsin Package" (Genetics
Computer Group, Madison, WI) and "NCBI Toolbox" (National Center For Biotechnology
Information). Suitable programs are described, for example, in Martin J. Bishop, ed., Guide
to Human Genome Computing, 2d Edition, Academic Press, San Diego, CA. (1998); and
Leonard F. Peruski, Jr., and Anne Harwood Peruski, The Internet and the New Biology:
Tools for Genomic and Molecular Research, American Society for Microbiology,
Washington, D.C. (1997).

Computer algorithms enable the identification of M. catarrhalis open reading frames
(ORFs) within SEQ ID NO: 1 - SEQ ID NO: 1920 which contain homology to ORFs or
proteins from other organisms. Examples of such similarity-search algorithms include the
BLAST [Altschul et al., J. Mol. Biol. 215:403-410 (1990)] and Smith-Waterman [Smith and
Waterman (1981) Advances in Applied Mathematics, 2:482-489] search algorithms.
Suitable search algorithms are described, for example, in Martin J. Bishop, ed., Guide to
Human Genome Computing, 2d Edition, Academic Press, San Diego, CA. (1998); and
Leonard F. Peruski, Jr., and Anne Harwood Peruski, The Internet and the New Biology:
Tools for Genomic and Molecular Research, American Society for Microbiology,
Washington, D.C. (1997). Such algorithms are utilized on computer systems as exemplified
below. The ORFs so identified represent protein encoding fragments within the M.
catarrhalis genome and M. catarrhalis plasmids and are useful in producing commercially
important proteins such as enzymes used in fermentation reactions and in the production of
commercially useful metabolites.
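
As a non-limiting illustration of how open reading frames might be located in a nucleotide
sequence before any similarity search is run, the following sketch scans all six reading
frames. It is an assumption-laden simplification and not the method used in the present
examples: only ATG is treated as a start codon (bacterial genes may also use alternative
start codons), the function names and the minimum-length parameter are hypothetical, and the
test sequence is a placeholder.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, frame, start, end, orf) tuples for ATG...stop ORFs.

    All six reading frames are scanned. `start` and `end` are 0-based
    offsets on the strand that was scanned, and the stop codon is counted
    in the ORF length. Only ATG is treated as a start codon (assumption).
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs


# Placeholder sequence, not taken from the Sequence Listing.
example_orfs = find_orfs("CCATGGCTAAATAGCC")
```

ORFs reported by such a scan would then be candidates for comparison against ORFs or proteins
from other organisms using the similarity-search algorithms named above.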

The present invention further provides systems, particularly computer-based systems,
which contain the sequence information described herein. Such systems are designed to
identify commercially important fragments of the M. catarrhalis genome and plasmids. As
used herein, "a computer-based system" refers to the hardware means, software means, and
data storage means used to analyze the nucleotide sequence information of the present
invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A person skilled in the art can readily appreciate that any one of the currently
available computer-based systems is suitable for use in the present invention. The
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded
thereon the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target structural
motif with the sequence information stored within the data storage means. Search means are
used to identify fragments or regions of the M. catarrhalis genome and plasmids which are
similar to, or "match", a particular target sequence or target motif. A variety of
algorithms are known in the art and have been disclosed publicly, and a variety of
commercially available software for conducting homology-based similarity searches is
available and can be used in the computer-based systems of the present invention. Examples
of such software include, but are not limited to, FASTA (GCG Wisconsin Package), Bic_SW
(Compugen Bioccelerator), BLASTN2, BLASTP2, BLASTX2 (NCBI) and Motifs (GCG).
Suitable software programs are described, for example, in Martin J. Bishop, ed., Guide to
Human Genome Computing, 2d Edition, Academic Press, San Diego, CA. (1998); and
Leonard F. Peruski, Jr., and Anne Harwood Peruski, The Internet and the New Biology:
Tools for Genomic and Molecular Research, American Society for Microbiology,
Washington, D.C. (1997). A person skilled in the art can readily recognize that any one of
the available algorithms or implementing software packages for conducting homology
searches can be adapted for use in the present computer-based systems.
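
For illustration only, a search means of the kind described above could be built around a
local alignment routine. The sketch below is a textbook Smith-Waterman implementation with
a linear gap penalty; it is not the Bic_SW or BLAST software named above, and the scoring
values (match +2, mismatch -1, gap -2), the function name, and the example sequences are
assumptions chosen for this example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    Uses a linear gap penalty; the default scoring values are illustrative.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; a cell never drops below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero-scoring cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        score = H[i][j]
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Placeholder sequences sharing the segment ACGTACG.
result = smith_waterman("TTTACGTACGTTT", "GGACGTACGGG")
```

A routine of this kind compares one target against one stored sequence; a complete search
means would repeat the comparison across every sequence held in the data storage means.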

As used herein, a "target sequence" can be any DNA or amino acid sequence of six or
more nucleotides or two or more amino acids. A person skilled in the art can readily
recognize that the longer a target sequence is, the less likely a target sequence will be present
as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues.
However, it is well recognized that many genes are longer than 500 amino acids, or 1.5 kb in
length, and that commercially important fragments of the M. catarrhalis genome and
plasmids from M. catarrhalis, such as sequence fragments involved in gene expression and
protein processing, will often be shorter than 30 nucleotides.
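
The relationship between target length and chance occurrence can be made concrete with a
rough estimate. The sketch below assumes, purely for illustration, that residues are
independent and equally frequent, so that a target of length L matches a given position
with probability (1/k)^L for an alphabet of k letters; the function name and the database
size of one million residues are hypothetical and are not figures from this application.

```python
def expected_random_matches(target_len, db_len, alphabet_size):
    """Expected number of exact chance matches of a target in a database.

    Assumes independent, equally frequent residues (a simplification) and
    counts a single strand only.
    """
    positions = max(db_len - target_len + 1, 0)
    return positions * (1.0 / alphabet_size) ** target_len


# Hypothetical database of 1,000,000 nucleotides (alphabet of 4 letters).
six_mer = expected_random_matches(6, 1_000_000, 4)
thirty_mer = expected_random_matches(30, 1_000_000, 4)

# Hypothetical database of 1,000,000 amino acids (alphabet of 20 letters).
ten_residue_peptide = expected_random_matches(10, 1_000_000, 20)
```

Under these assumptions a six-nucleotide target is expected to occur by chance hundreds of
times in such a database, whereas a 30-nucleotide target is expected to occur far less than
once, which is consistent with the preference for longer target sequences stated above.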

As used herein, "a target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen based on 
a specific functional domain or three-dimensional configuration which is formed upon the 
folding of the target polypeptide. There are a variety of target motifs known in the art. 
Protein target motifs include, but are not limited to, enzymatic active sites, membrane-
spanning regions, and signal sequences. Nucleic acid target motifs include, but are not
limited to, promoter sequences, hairpin structures and inducible expression elements (protein 
binding sequences). 
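
As a non-limiting sketch of how a nucleic acid target motif might be located, the following
example searches for a promoter-like pattern using a regular expression. The pattern is the
widely cited Escherichia coli sigma-70 consensus (a TTGACA element, a spacer of 16 to 18
nucleotides, and a TATAAT element); its applicability to M. catarrhalis is an assumption made
only for illustration, and the function name and test sequence are hypothetical.

```python
import re

# E. coli sigma-70 consensus used as a stand-in motif (assumption, see text).
PROMOTER_LIKE = re.compile(r"TTGACA[ACGT]{16,18}TATAAT")


def find_motif(seq, pattern=PROMOTER_LIKE):
    """Return (start, end, matched_text) for non-overlapping motif matches."""
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq)]


# Placeholder sequence with a 17-nucleotide spacer between the two elements.
example_seq = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "CC"
hits = find_motif(example_seq)
```

Real promoters rarely match a consensus exactly, so a practical motif search would usually
tolerate mismatches, for example by scoring candidates against a position weight matrix.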

A variety of structural formats for the input and output means can be used to input 
and output the information in the computer-based systems of the present invention. A
preferred format for an output means ranks fragments of the M. catarrhalis genome and
plasmids possessing varying degrees of homology to the target sequence or target motif.
Such presentation provides a person skilled in the art with a ranking of sequences which
contain various amounts of the target sequence or target motif and identifies the degree of
homology contained in the identified fragment.

A variety of comparing means can be used to compare a target sequence or target
motif with the data storage means to identify sequence fragments of the M. catarrhalis
genome and plasmids. In the present examples, software implementing the
BLASTP2 and bic_SW algorithms (Altschul et al., J. Mol. Biol. 215:403-410 (1990);
Compugen Bioccelerator) was used to identify open reading frames within the M. catarrhalis
genome and plasmids. A person skilled in the art can readily recognize that any one of the
publicly available homology search programs can be used as the search means for the
computer-based systems of the present invention. Suitable programs are described, for
example, in Martin J. Bishop, ed., Guide to Human Genome Computing, 2d Edition,
Academic Press, San Diego, CA. (1998); and Leonard F. Peruski, Jr., and Anne Harwood
Peruski, The Internet and the New Biology: Tools for Genomic and Molecular Research,
American Society for Microbiology, Washington, D.C. (1997).
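
The ranked output format described above can be illustrated with a deliberately simple
comparing means. The sketch below scores each stored fragment by its best ungapped percent
identity to the target and sorts the fragments from most to least similar. It is not the
BLASTP2 or bic_SW software used in the present examples; the scoring scheme, function names,
fragment names, and sequences are all assumptions made for this example.

```python
def best_ungapped_identity(target, fragment):
    """Highest percent identity of `target` against any same-length window."""
    n = len(target)
    if n == 0 or len(fragment) < n:
        return 0.0
    best = 0
    for start in range(len(fragment) - n + 1):
        window = fragment[start:start + n]
        matches = sum(1 for x, y in zip(target, window) if x == y)
        best = max(best, matches)
    return 100.0 * best / n


def rank_fragments(target, fragments):
    """Return (name, percent_identity) pairs sorted from best to worst."""
    scored = [
        (name, best_ungapped_identity(target, seq))
        for name, seq in fragments.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Placeholder fragments, not taken from the Sequence Listing.
fragments = {
    "frag_c": "GGGGGGGGGGGG",
    "frag_a": "TTACGTACGTTT",
    "frag_b": "TTACGTTCGTTT",
}
ranking = rank_fragments("ACGTACGT", fragments)
```

Each entry in the ranking pairs a fragment with its degree of similarity, which mirrors the
preferred output format in which both the ordering and the degree of homology are reported.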

The invention features M. catarrhalis polypeptides, preferably a substantially pure
preparation of an M. catarrhalis polypeptide, or a recombinant M. catarrhalis polypeptide.
In preferred embodiments: the polypeptide has biological activity; the polypeptide has an
amino acid sequence at least about 60%, 70%, 80%, 90%, 95%, 98%, or 99% identical to an
amino acid sequence of the invention contained in the Sequence Listing, preferably it has
about 65% sequence identity with an amino acid sequence of the invention contained in the
Sequence Listing, and most preferably it has about 92% to about 99% sequence identity with
an amino acid sequence of the invention contained in the Sequence Listing; the polypeptide
has an amino acid sequence essentially the same as an amino acid sequence of the invention
contained in the Sequence Listing; the polypeptide is at least about 5, 10, 20, 50, 100, or 150
amino acid residues in length; the polypeptide includes at least about 5, preferably at least
about 10, more preferably at least about 20, still more preferably at least about 50, 100, or
150 contiguous amino acid residues of the invention contained in the Sequence Listing. In
yet another preferred embodiment, an amino acid sequence which differs in sequence
identity by about 7% to about 8% from the M. catarrhalis amino acid sequences of the
invention contained in the Sequence Listing is also encompassed by the invention.
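
As a hedged illustration of how a percent identity figure of the kind recited above might
be computed, the sketch below counts identical positions in a pair of sequences that have
already been aligned. The convention used (identical columns divided by all aligned columns,
with gap columns counted in the denominator) is an assumption, since this application does
not fix a particular formula, and the function name and peptide strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Identical columns are divided by all aligned columns; columns in which
    both sequences carry a gap are ignored. This convention is an assumption.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b)
        if not (x == "-" and y == "-")
    ]
    if not columns:
        return 0.0
    identical = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * identical / len(columns)


# Placeholder peptides: one substitution in ten residues.
one_substitution = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")

# Placeholder peptides: one gap in six aligned columns.
one_gap = percent_identity("MKT-AY", "MKTGAY")
```

On this convention, a sequence that differs from a reference by about 7% to about 8% has
about 92% to about 93% sequence identity with it.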

In preferred embodiments: the M. catarrhalis polypeptide is encoded by a nucleic
acid of the invention contained in the Sequence Listing, or by a nucleic acid having at least
about 60%, 70%, 80%, 90%, 95%, 98%, or 99% sequence identity or % homology with a
nucleic acid of the invention contained in the Sequence Listing.

In a preferred embodiment, the subject M. catarrhalis polypeptide differs in amino
acid sequence at about 1, 2, 3, 5, 10 or more residues from a sequence of the invention
contained in the Sequence Listing. The differences, however, are such that the M.
catarrhalis polypeptide exhibits an M. catarrhalis biological activity, e.g., the M. catarrhalis
polypeptide retains a biological activity of a naturally occurring M. catarrhalis enzyme.

In preferred embodiments, the polypeptide includes all or a fragment of an amino 
acid sequence of the invention contained in the Sequence Listing; fused, in reading frame, to 
additional amino acid residues, preferably to residues encoded by genomic DNA 5' or 3' to
the genomic DNA which encodes a sequence of the invention contained in the Sequence 
Listing. 

In yet other preferred embodiments, the M. catarrhalis polypeptide is a recombinant
fusion protein having a first M. catarrhalis polypeptide portion and a second polypeptide
portion, e.g., a second polypeptide portion having an amino acid sequence unrelated to M.
catarrhalis. The second polypeptide portion can be, e.g., any of glutathione-S-transferase, a
DNA binding domain, or a polymerase activating domain. In a preferred embodiment, the
fusion protein can be used in a two-hybrid assay.

Polypeptides of the invention include those which arise as a result of alternative
transcription events, alternative RNA splicing events, and alternative translational and
posttranslational events.

In a preferred embodiment, the encoded M. catarrhalis polypeptide differs (e.g., by
amino acid substitution, addition or deletion of at least one amino acid residue) in amino
acid sequence at about 1, 2, 3, 5, 10 or more residues, from a sequence of the invention
contained in the Sequence Listing. The differences, however, are such that the M.
catarrhalis encoded polypeptide exhibits an M. catarrhalis biological activity, e.g., the
encoded M. catarrhalis enzyme retains a biological activity of a naturally occurring M.
catarrhalis enzyme.

In preferred embodiments, the encoded polypeptide includes all or a fragment of an
amino acid sequence of the invention contained in the Sequence Listing; fused, in reading
frame, to additional amino acid residues, preferably to residues encoded by genomic DNA 5'
or 3' to the genomic DNA which encodes a sequence of the invention contained in the
Sequence Listing.

The M. catarrhalis strain, 98-4362, from which genomic sequences have been
sequenced, was deposited on July 20, 1998, in the American Type Culture Collection
and assigned the ATCC designation # 202156.

Included in the invention are: allelic variations; natural mutants; induced mutants;
proteins encoded by DNA that hybridizes under high or low stringency conditions to a nucleic
acid which encodes a polypeptide of the invention contained in the Sequence Listing (for
definitions of high and low stringency see Current Protocols in Molecular Biology, John
Wiley & Sons, New York, 1989, 6.3.1 - 6.3.6, hereby incorporated by reference); and,
polypeptides specifically bound by antisera to M. catarrhalis polypeptides, especially by
antisera to an active site or binding domain of an M. catarrhalis polypeptide. The invention
also includes fragments, preferably biologically active fragments. These and other
polypeptides are also referred to herein as M. catarrhalis polypeptide analogs or variants.

The invention further provides nucleic acids, e.g., RNA or DNA and their respective 
complements, encoding a polypeptide of the invention. This includes double stranded 
nucleic acids as well as coding and antisense single strands. 

In preferred embodiments, the subject M. catarrhalis nucleic acid will include a
transcriptional regulatory sequence, e.g., at least one of a transcriptional promoter or
transcriptional enhancer sequence, operably linked to the M. catarrhalis gene sequence, e.g.,
to render the M. catarrhalis gene sequence suitable for expression in a recombinant host cell.

In yet a further preferred embodiment, the nucleic acid which encodes an M.
catarrhalis polypeptide of the invention hybridizes under stringent conditions to a nucleic
acid probe corresponding to at least about 8 consecutive nucleotides of the invention
contained in the Sequence Listing; more preferably to at least about 12 consecutive
nucleotides of the invention contained in the Sequence Listing; still more preferably to at
least about 20 consecutive nucleotides of the invention contained in the Sequence Listing;
most preferably to at least about 40 consecutive nucleotides of the invention contained in the
Sequence Listing.

In another aspect, the invention provides a substantially pure nucleic acid having a
nucleotide sequence which encodes an M. catarrhalis polypeptide. In preferred
embodiments: the encoded polypeptide has biological activity; the encoded polypeptide has
an amino acid sequence at least about 60%, 70%, 80%, 90%, 95%, 98% or 99% homologous
to an amino acid sequence of the invention contained in the Sequence Listing; the encoded
polypeptide has an amino acid sequence essentially the same as an amino acid sequence of
the invention contained in the Sequence Listing; the encoded polypeptide is at least about 5,
10, 20, 50, 100, or 150 amino acids in length; the encoded polypeptide comprises at least
about 5, preferably at least about 10, more preferably at least about 20, still more preferably
at least about 50, 100, or 150 contiguous amino acids of the invention contained in the
Sequence Listing.

In another aspect, the invention encompasses: a vector including a nucleic acid
which encodes an M. catarrhalis polypeptide or an M. catarrhalis polypeptide variant as
described herein; a host cell transfected with the vector; and a method of producing a
recombinant M. catarrhalis polypeptide or M. catarrhalis polypeptide variant; including
culturing the cell, e.g., in a cell culture medium, and isolating an M. catarrhalis polypeptide
or M. catarrhalis polypeptide variant, e.g., from the cell or from the cell culture medium.

One embodiment of the invention is directed to substantially isolated nucleic acids.
Nucleic acids of the invention include sequences comprising at least about 8 nucleotides in
length, more preferably at least about 12 nucleotides in length, even more preferably at least
about 15-20 nucleotides in length, that correspond to a subsequence of any one of SEQ ID
NO: 1 - SEQ ID NO: 1920 or complements thereof. Alternatively, the nucleic acids
comprise sequences contained within any ORF (open reading frame), including a complete
protein-coding sequence, of which any of SEQ ID NO: 1 - SEQ ID NO: 1920 forms a part.
The invention encompasses sequence-conservative variants and function-conservative
variants of these sequences. The nucleic acids may be DNA, RNA, DNA/RNA duplexes,
protein-nucleic acid (PNA), or derivatives thereof.

In another aspect, the invention features a purified recombinant nucleic acid having at
least about 50%, 60%, 70%, 80%, 90%, 95%, 98%, or 99% homology with a sequence of the
invention contained in the Sequence Listing.

The invention also encompasses recombinant DNA (including DNA cloning and
expression vectors) comprising these M. catarrhalis-derived sequences; host cells
comprising such DNA, including fungal, bacterial, yeast, plant, insect, and mammalian host
cells; and methods for producing expression products comprising RNA and polypeptides
encoded by the M. catarrhalis sequences. These methods are carried out by incubating a
host cell comprising an M. catarrhalis-derived nucleic acid sequence under conditions in
which the sequence is expressed. The host cell may be native or recombinant. The
polypeptides can be obtained by (a) harvesting the incubated cells to produce a cell fraction
and a medium fraction; and (b) recovering the M. catarrhalis polypeptide from the cell
fraction, the medium fraction, or both. The polypeptides can also be made by in vitro
translation.

In another aspect, the invention features nucleic acids capable of binding mRNA of
M. catarrhalis. Such nucleic acid is capable of acting as antisense nucleic acid to control
the translation of mRNA of M. catarrhalis. A further aspect features a nucleic acid which is
capable of binding specifically to an M. catarrhalis nucleic acid. These nucleic acids are
also referred to herein as complements and have utility as probes and as capture reagents.
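
By way of illustration only, the sequence of an antisense DNA oligonucleotide complementary
to a stretch of mRNA can be derived by complementing each base and reversing the result, so
that both strands are written 5' to 3'. The function name and the six-nucleotide mRNA
segment below are hypothetical, and the sketch says nothing about which oligonucleotides
would bind specifically or block translation in practice.

```python
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_dna(mrna_segment):
    """Return a DNA oligo (5'->3') complementary to an mRNA segment (5'->3')."""
    # Complement each RNA base with its DNA partner, then reverse so that
    # the oligo is also written in the 5'->3' direction.
    return mrna_segment.upper().translate(RNA_TO_DNA_COMPLEMENT)[::-1]


# Placeholder mRNA segment, not taken from the Sequence Listing.
oligo = antisense_dna("AUGGCU")
```

The same complement-and-reverse step applies when designing probes and capture reagents
against a DNA target, with T in place of U in the input alphabet.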

In another aspect, the invention features an expression system comprising an open 
reading frame corresponding to M. catarrhalis nucleic acid. The nucleic acid further 
comprises a control sequence compatible with an intended host. The expression system is 
useful for making polypeptides corresponding to M. catarrhalis nucleic acid.

In another aspect, the invention encompasses: a vector including a nucleic acid which 
encodes an M. catarrhalis polypeptide or an M. catarrhalis polypeptide variant as described 
herein; a host cell transfected with the vector; and a method of producing a recombinant M. 
catarrhalis polypeptide or M. catarrhalis polypeptide variant; including culturing the cell, 
e.g., in a cell culture medium, and isolating the M. catarrhalis polypeptide or M. catarrhalis polypeptide
variant, e.g., from the cell or from the cell culture medium. 

Yet another embodiment of the invention encompasses reagents for detecting
bacterial infection, including M. catarrhalis infection, which comprise at least one M.
catarrhalis-derived nucleic acid defined by any one of SEQ ID NO: 1 - SEQ ID NO: 1920,
or sequence-conservative or function-conservative variants thereof. Alternatively, the
diagnostic reagents comprise nucleotide sequences that are contained within any open
reading frames (ORFs), including preferably complete protein-coding sequences, contained
within any of SEQ ID NO: 1 - SEQ ID NO: 1920, or polypeptide sequences contained within
any of SEQ ID NO: 1921 - SEQ ID NO: 3840, or polypeptides of which any of the above
sequences forms a part, or antibodies directed against any of the above peptide sequences or
function-conservative variants and/or fragments thereof.

The invention further provides antibodies, preferably monoclonal antibodies, which
specifically bind to the polypeptides of the invention. Methods are also provided for
producing antibodies in a host animal. The methods of the invention comprise immunizing
an animal with at least one M. catarrhalis-derived immunogenic component, wherein the
immunogenic component comprises one or more of the polypeptides encoded by any one of
SEQ ID NO: 1 - SEQ ID NO: 1920 or sequence-conservative or function-conservative
variants thereof; or polypeptides that are contained within any ORFs, including complete
protein-coding sequences, of which any of SEQ ID NO: 1 - SEQ ID NO: 1920 forms a part;
or polypeptide sequences contained within any of SEQ ID NO: 1921 - SEQ ID NO: 3840; or
polypeptides of which any of SEQ ID NO: 1921 - SEQ ID NO: 3840 forms a part. Host
animals include any warm blooded animal, including without limitation mammals and birds.
Such antibodies have utility as reagents for immunoassays to evaluate the abundance and
distribution of M. catarrhalis-specific antigens.

In yet another aspect, the invention provides diagnostic methods for detecting M.
catarrhalis antigenic components or anti-M. catarrhalis antibodies in a sample. M.
catarrhalis antigenic components may be detected by known processes, including but not
limited to detection by a process comprising: (i) contacting a sample suspected to contain a
bacterial antigenic component with a bacterial-specific antibody, under conditions in which a
stable antigen-antibody complex can form between the antibody and bacterial antigenic
components in the sample; and (ii) detecting any antigen-antibody complex formed in step
(i), wherein detection of an antigen-antibody complex indicates the presence of at least one
bacterial antigenic component in the sample. In different embodiments of this method, the
antibodies used are directed against a sequence encoded by any of SEQ ID NO: 1 - SEQ ID
NO: 1920 or sequence-conservative or function-conservative variants thereof, or against a
polypeptide sequence contained in any of SEQ ID NO: 1921 - SEQ ID NO: 3840 or
function-conservative variants thereof.

In yet another aspect, the invention provides a method for detecting antibacterial-
specific antibodies in a sample, which comprises: (i) contacting a sample suspected to
contain antibacterial-specific antibodies with an M. catarrhalis antigenic component, under
conditions in which a stable antigen-antibody complex can form between the M. catarrhalis
antigenic component and antibacterial antibodies in the sample; and (ii) detecting any
antigen-antibody complex formed in step (i), wherein detection of an antigen-antibody
complex indicates the presence of antibacterial antibodies in the sample. In different
embodiments of this method, the antigenic component is encoded by a sequence contained in
any of SEQ ID NO: 1 - SEQ ID NO: 1920 or sequence-conservative and function-
conservative variants thereof, or is a polypeptide sequence contained in any of SEQ ID NO:
1921 - SEQ ID NO: 3840 or function-conservative variants thereof.

In another aspect, the invention features a method of generating vaccines for
immunizing an individual against M. catarrhalis. The method includes: immunizing a
subject with an M. catarrhalis polypeptide, e.g., a surface or secreted polypeptide, or a
combination of such peptides or active portion(s) thereof, and a pharmaceutically acceptable
carrier. Such vaccines have therapeutic and prophylactic utilities.

In another aspect, the invention features a method of evaluating a compound, e.g., a
polypeptide, e.g., a fragment of a host cell polypeptide, for the ability to bind an M.
catarrhalis polypeptide. The method includes contacting the compound to be evaluated with
an M. catarrhalis polypeptide and determining if the compound binds or otherwise interacts
with the M. catarrhalis polypeptide. Compounds which bind or otherwise interact with M.
catarrhalis polypeptides are candidates as modulators, including activators and inhibitors, of
the bacterial life cycle. These assays can be performed in vitro or in vivo.

In another aspect, the invention features a method of evaluating a compound, e.g., a
polypeptide, e.g., a fragment of a host cell polypeptide, for the ability to bind an M.
catarrhalis nucleic acid, e.g., DNA or RNA. The method includes contacting the compound
to be evaluated with an M. catarrhalis nucleic acid and determining if the compound binds
or otherwise interacts with the M. catarrhalis nucleic acid. Compounds which bind M.
catarrhalis nucleic acids are candidates as modulators, including activators and inhibitors,
of the bacterial life cycle. These assays can be performed in vitro or in vivo.

A particularly preferred embodiment of the invention is directed to a method of
screening test compounds for anti-bacterial activity, which method comprises: selecting as a
target a bacterial specific sequence, which sequence is essential to the viability of a bacterial
species; contacting a test compound with said target sequence; and selecting those test
compounds which bind to said target sequence as potential anti-bacterial candidates. In one
embodiment, the target sequence selected is specific to a single species, or even a single
strain, such as, for example, the strain M. catarrhalis 98-4362. In a second embodiment, the
target sequence is common to at least two species of bacteria. In a third embodiment, the
target sequence is common to a family of bacteria. The target sequence may be a nucleic
acid sequence or a polypeptide sequence. Methods employing sequences common to more
than one species of microorganism may be used to screen candidates for broad spectrum
anti-bacterial activity.

The invention also provides methods for preventing or treating disease caused by
certain bacteria, including M. catarrhalis, which are carried out by administering to an
animal in need of such treatment, in particular a warm-blooded vertebrate, including but not
limited to birds and mammals, a compound that specifically inhibits or interferes with the
function of a bacterial polypeptide or nucleic acid. In a particularly preferred embodiment,
the mammal to be treated is human.

DETAILED DESCRIPTION OF THE INVENTION 

The sequences of the present invention include the specific nucleic acid and amino
acid sequences set forth in the Sequence Listing that forms a part of the present specification,
and which are designated SEQ ID NO: 1 - SEQ ID NO: 3840. Use of the terms "SEQ ID
NO: 1 - SEQ ID NO: 1920", "SEQ ID NO: 1921 - SEQ ID NO: 3840", "the sequences
depicted in Table 2", etc., is intended, for convenience, to refer to each individual SEQ ID
NO individually, and is not intended to refer to the genus of these sequences unless such
reference would be indicated. In other words, it is a shorthand for listing all of these
sequences individually. The invention encompasses each sequence individually, as well as
any combination thereof.

DEFINITIONS 

"Nucleic acid" or "polynucleotide" as used herein refers to purine- and pyrimidine-containing
polymers of any length, either polyribonucleotides or polydeoxyribonucleotides
or mixed polyribo-polydeoxyribo nucleotides. This includes single- and double-stranded
molecules, i.e., DNA-DNA, DNA-RNA and RNA-RNA hybrids, as well as "protein nucleic
acids" (PNA) formed by conjugating bases to an amino acid backbone. This also includes
nucleic acids containing modified bases.

A nucleic acid or polypeptide sequence that is "derived from" a designated sequence
refers to a sequence that corresponds to a region of the designated sequence. For nucleic
acid sequences, this encompasses sequences that are homologous or complementary to the
sequence, as well as "sequence-conservative variants" and "function-conservative variants."
For polypeptide sequences, this encompasses "function-conservative variants."
Sequence-conservative variants are those in which a change of one or more nucleotides in a given
codon position results in no alteration in the amino acid encoded at that position.
Function-conservative variants are those in which a given amino acid residue in a polypeptide has
been changed without altering the overall conformation and function of the native
polypeptide, including, but not limited to, replacement of an amino acid with one having
similar physico-chemical properties (such as, for example, acidic, basic, hydrophobic, and
the like). "Function-conservative" variants also include any polypeptides that have the
ability to elicit antibodies specific to a designated polypeptide.
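
As an illustration of the definition of sequence-conservative variants given above, the following minimal Python sketch checks whether two in-frame coding sequences differ at the nucleotide level yet encode the same amino acids. It assumes the standard codon assignments; the function names and the example sequences are hypothetical and are not taken from the Sequence Listing.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_sequence_conservative(reference, variant):
    """True if the variant differs in nucleotides but encodes the same amino acids."""
    return reference.upper() != variant.upper() and translate(reference) == translate(variant)


# Hypothetical example: GCT -> GCC is a silent change (both codons encode alanine).
print(is_sequence_conservative("ATGGCTAAA", "ATGGCCAAA"))
# GCT -> GAT replaces alanine with aspartic acid, so the variant is not sequence-conservative.
print(is_sequence_conservative("ATGGCTAAA", "ATGGATAAA"))
```

A function-conservative variant cannot be recognized this way, since that definition depends on conformation and function rather than on the codon table alone.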

An "M. catarrhalis-derived" nucleic acid or polypeptide sequence may or may not be
present in other bacterial species, and may or may not be present in all M. catarrhalis strains.
This term is intended to refer to the source from which the sequence was originally isolated.
Thus, an M. catarrhalis-derived polypeptide, as used herein, may be used, e.g., as a target to
screen for a broad spectrum antibacterial agent, to search for homologous proteins in other
species of bacteria or in eukaryotic organisms such as humans, etc.

A purified or isolated polypeptide or a substantially pure preparation of a polypeptide
are used interchangeably herein and, as used herein, mean a polypeptide that has been
separated from other proteins, lipids, and nucleic acids with which it naturally occurs.
Preferably, the polypeptide is also separated from substances, e.g., antibodies or gel matrix,
e.g., polyacrylamide, which are used to purify it. Preferably, the polypeptide constitutes at
least about 10, 20, 50, 70, 80 or 95% dry weight of the purified preparation. Preferably, the
preparation contains sufficient polypeptide to allow protein sequencing; at least about 1, 10,
or preferably 100 mg of polypeptide.

A purified preparation of cells refers to, in the case of plant or animal cells, an in
vitro preparation of cells and not an entire intact plant or animal. In the case of cultured cells
or microbial cells, it consists of a preparation of at least about 10%, more preferably at least
about 50%, of the subject cells.

A purified or isolated or a substantially pure nucleic acid, e.g., a substantially pure
DNA, (are terms used interchangeably herein) is a nucleic acid which is one or both of the
following: not immediately contiguous with both of the coding sequences with which it is
immediately contiguous (i.e., one at the 5' end and one at the 3' end) in the naturally-occurring
genome and plasmids of the organism from which the nucleic acid is derived; or
which is substantially free of a nucleic acid with which it occurs in the organism from which
the nucleic acid is derived. The term includes, for example, a recombinant DNA which is
incorporated into a vector, e.g., into an autonomously replicating plasmid or virus, or into the
genomic DNA of a prokaryote or eukaryote, or which exists as a separate molecule (e.g., a
cDNA or a genomic DNA fragment produced by PCR or restriction endonuclease treatment)
independent of other DNA sequences. Substantially pure DNA also includes a recombinant
DNA which is part of a hybrid gene encoding additional M. catarrhalis DNA sequence.

A "contig" as used herein is a nucleic acid representing a continuous stretch of
genomic sequence of an organism.

An "open reading frame", also referred to herein as ORF, is a region of nucleic acid
which encodes a polypeptide. This region may represent a portion of a coding sequence or a
total sequence and can be determined from a stop to stop codon or from a start to stop codon.

As used herein, a "coding sequence" is a nucleic acid which is transcribed into
messenger RNA and/or translated into a polypeptide when placed under the control of
appropriate regulatory sequences. The boundaries of the coding sequence are determined by
a translation start codon at the five prime terminus and a translation stop codon at the three
prime terminus. A coding sequence can include but is not limited to messenger RNA,
synthetic DNA, and recombinant nucleic acid sequences.

A "complement" of a nucleic acid as used herein refers to an anti-parallel or antisense 
sequence that participates in Watson-Crick base-pairing with the original sequence. 
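
The definition of a complement can be illustrated with a short Python sketch that derives the anti-parallel, Watson-Crick complement of a DNA sequence. The function name is hypothetical, and the sketch assumes unambiguous DNA bases (A, C, G, T) only.

```python
# Watson-Crick pairing partners for DNA bases.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the anti-parallel complement of a DNA sequence, written 5' to 3'."""
    return "".join(PAIRS[base] for base in reversed(dna.upper()))


print(reverse_complement("ATTGCC"))  # GGCAAT
```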

A "gene product" is a protein or structural RNA which is specifically encoded by a
gene.

As used herein, the term "probe" refers to a nucleic acid, peptide or other chemical
entity which specifically binds to a molecule of interest. Probes are often associated with or
capable of associating with a label. A label is a chemical moiety capable of detection.
Typical labels comprise dyes, radioisotopes, luminescent and chemiluminescent moieties,
fluorophores, enzymes, precipitating agents, amplification sequences, and the like.
Similarly, a nucleic acid, peptide or other chemical entity which specifically binds to a
molecule of interest and immobilizes such molecule is referred to herein as a "capture ligand".
Capture ligands are typically associated with or capable of associating with a support such as
nitro-cellulose, glass, nylon membranes, beads, particles and the like. The specificity of
hybridization is dependent on conditions such as the base pair composition of the
nucleotides, and the temperature and salt concentration of the reaction. These conditions are
readily discernable to one of ordinary skill in the art using routine experimentation.

"Homologous" refers to the sequence similarity or sequence identity between two
polypeptides or between two nucleic acid molecules. When a position in both of the two
compared sequences is occupied by the same base or amino acid monomer subunit, e.g., if a
position in each of two DNA molecules is occupied by adenine, then the molecules are
homologous at that position. The percent of homology between two sequences is a function
of the number of matching or homologous positions shared by the two sequences divided by
the number of positions compared x 100. For example, if 6 of 10 of the positions in two
sequences are matched or homologous then the two sequences are 60% homologous. By
way of example, the DNA sequences ATTGCC and TATGGC share 50% homology.
Generally, a comparison is made when two sequences are aligned to give maximum
homology.
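
The percent homology calculation defined above can be sketched in a few lines of Python. The sketch assumes that the two sequences have already been aligned and are of equal length with no gaps; the function name is hypothetical.

```python
def percent_homology(seq_a, seq_b):
    """Percent of aligned positions occupied by the same base or amino acid."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


# The example from the text: positions 3, 4 and 6 match, so 3 of 6 positions agree.
print(percent_homology("ATTGCC", "TATGGC"))  # 50.0
```

Finding the alignment that gives maximum homology is a separate step that this sketch does not attempt.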

Nucleic acids are hybridizable to each other when at least one strand of a nucleic acid
can anneal to the other nucleic acid under defined stringency conditions. Stringency of
hybridization is determined by: (a) the temperature at which hybridization and/or washing is
performed; and (b) the ionic strength and polarity of the hybridization and washing solutions.
Hybridization requires that the two nucleic acids contain complementary sequences;
depending on the stringency of hybridization, however, mismatches may be tolerated.
Typically, hybridization of two sequences at high stringency (such as, for example, in a
solution of 0.5X SSC, at 65° C) requires that the sequences be essentially completely
homologous. Conditions of intermediate stringency (such as, for example, 2X SSC at 65° C)
and low stringency (such as, for example, 2X SSC at 55° C) require correspondingly less
overall complementarity between the hybridizing sequences. (1X SSC is 0.15 M NaCl,
0.015 M Na citrate).
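
For convenience, the salt concentrations implied by the SSC strengths named above can be worked out from the stated 1X composition. The following Python sketch is simple arithmetic under that definition; the dictionary and function names are hypothetical.

```python
# 1X SSC composition as stated in the text (molar concentrations).
SSC_1X = {"NaCl": 0.15, "Na citrate": 0.015}


def ssc_molarity(fold):
    """Molar concentration of each SSC component at the given fold strength."""
    return {salt: fold * molar for salt, molar in SSC_1X.items()}


# 0.5X SSC is the high stringency example; 2X SSC is used in the other two examples.
for fold in (0.5, 2.0):
    print(fold, ssc_molarity(fold))
```

Under this definition, 0.5X SSC corresponds to about 0.075 M NaCl and 0.0075 M Na citrate, and 2X SSC to about 0.30 M NaCl and 0.030 M Na citrate.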

The terms peptides, proteins, and polypeptides are used interchangeably herein. 

As used herein, the term "surface protein" refers to all surface accessible proteins, 
e.g. inner and outer membrane proteins, proteins adhering to the cell wall, and secreted 
proteins. 

A polypeptide has M. catarrhalis biological activity if it has one, two or preferably
more of the following properties: (1) when expressed in the course of an M. catarrhalis
infection, it can promote or mediate the attachment of M. catarrhalis to a cell; (2) it has an
enzymatic activity, structural or regulatory function characteristic of an M. catarrhalis
protein; (3) the gene which encodes it can rescue a lethal mutation in an M. catarrhalis gene.
A polypeptide has biological activity if it is an antagonist, agonist, or super-agonist of a
polypeptide having one of the above-listed properties.

A biologically active fragment or analog is one having an in vivo or in vitro activity
which is characteristic of the M. catarrhalis polypeptides of the invention contained in the
Sequence Listing, or of other naturally occurring M. catarrhalis polypeptides, e.g., one or
more of the biological activities described herein. Especially preferred are fragments which
exist in vivo, e.g., fragments which arise from post-transcriptional processing or which arise
from translation of alternatively spliced RNAs. Fragments include those expressed in native
or endogenous cells as well as those made in expression systems, e.g., in CHO (Chinese
Hamster Ovary) cells. Because peptides such as M. catarrhalis polypeptides often exhibit a
range of physiological properties and because such properties may be attributable to different
portions of the molecule, a useful M. catarrhalis fragment or M. catarrhalis analog is one
which exhibits a biological activity in any biological assay for M. catarrhalis activity. The
fragment or analog possesses about 10%, preferably about 40%, more preferably about 60%,
70%, 80% or 90% or greater of the activity of M. catarrhalis, in any in vivo or in vitro
assay.

Analogs can differ from naturally occurring M. catarrhalis polypeptides in amino
acid sequence or in ways that do not involve sequence, or both. Non-sequence modifications
include changes in acetylation, methylation, phosphorylation, carboxylation, or
glycosylation. Preferred analogs include M. catarrhalis polypeptides (or biologically active
fragments thereof) whose sequences differ from the wild-type sequence by one or more
conservative amino acid substitutions or by one or more non-conservative amino acid
substitutions, deletions, or insertions which do not substantially diminish the biological
activity of the M. catarrhalis polypeptide. Conservative substitutions typically include the
substitution of one amino acid for another with similar characteristics, e.g., substitutions
within the following groups: valine, glycine; glycine, alanine; valine, isoleucine, leucine;
aspartic acid, glutamic acid; asparagine, glutamine; serine, threonine; lysine, arginine; and
phenylalanine, tyrosine. Other conservative substitutions can be made in view of the table
below.

TABLE 1

CONSERVATIVE AMINO ACID REPLACEMENTS

For Amino Acid   Code   Replace with any of

Alanine          A      D-Ala, Gly, beta-Ala, L-Cys, D-Cys
Arginine         R      D-Arg, Lys, D-Lys, homo-Arg, D-homo-Arg, Met, Ile,
                        D-Met, D-Ile, Orn, D-Orn
Asparagine       N      D-Asn, Asp, D-Asp, Glu, D-Glu, Gln, D-Gln
Aspartic Acid    D      D-Asp, D-Asn, Asn, Glu, D-Glu, Gln, D-Gln
Cysteine         C      D-Cys, S-Me-Cys, Met, D-Met, Thr, D-Thr
Glutamine        Q      D-Gln, Asn, D-Asn, Glu, D-Glu, Asp, D-Asp
Glutamic Acid    E      D-Glu, D-Asp, Asp, Asn, D-Asn, Gln, D-Gln
Glycine          G      Ala, D-Ala, Pro, D-Pro, beta-Ala, Acp
Isoleucine       I      D-Ile, Val, D-Val, Leu, D-Leu, Met, D-Met
Leucine          L      D-Leu, Val, D-Val, Leu, D-Leu, Met, D-Met
Lysine           K      D-Lys, Arg, D-Arg, homo-Arg, D-homo-Arg, Met, D-Met,
                        Ile, D-Ile, Orn, D-Orn
Methionine       M      D-Met, S-Me-Cys, Ile, D-Ile, Leu, D-Leu, Val, D-Val
Phenylalanine    F      D-Phe, Tyr, D-Thr, L-Dopa, His, D-His, Trp, D-Trp,
                        Trans-3,4, or 5-phenylproline, cis-3,4, or
                        5-phenylproline
Proline          P      D-Pro, L-I-thioazolidine-4-carboxylic acid, D- or
                        L-1-oxazolidine-4-carboxylic acid
Serine           S      D-Ser, Thr, D-Thr, allo-Thr, Met, D-Met, Met(O),
                        D-Met(O), L-Cys, D-Cys
Threonine        T      D-Thr, Ser, D-Ser, allo-Thr, Met, D-Met, Met(O),
                        D-Met(O), Val, D-Val
Tyrosine         Y      D-Tyr, Phe, D-Phe, L-Dopa, His, D-His
Valine           V      D-Val, Leu, D-Leu, Ile, D-Ile, Met, D-Met

Other analogs within the invention are those with modifications which increase
peptide stability; such analogs may contain, for example, one or more non-peptide bonds
(which replace the peptide bonds) in the peptide sequence. Also included are: analogs that
include residues other than naturally occurring L-amino acids, e.g., D-amino acids or
non-naturally occurring or synthetic amino acids, e.g., β or γ amino acids; and cyclic analogs.
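
Returning to the conservative substitution groups listed in the text preceding Table 1, the following Python sketch shows one way such groups might be encoded to test whether a single-residue replacement counts as conservative. It covers only those listed groups, not the D-amino acid and non-standard replacements of Table 1, and the names used are hypothetical.

```python
# Substitution groups as listed in the text preceding Table 1 (one-letter codes).
CONSERVATIVE_GROUPS = [
    {"V", "G"},        # valine, glycine
    {"G", "A"},        # glycine, alanine
    {"V", "I", "L"},   # valine, isoleucine, leucine
    {"D", "E"},        # aspartic acid, glutamic acid
    {"N", "Q"},        # asparagine, glutamine
    {"S", "T"},        # serine, threonine
    {"K", "R"},        # lysine, arginine
    {"F", "Y"},        # phenylalanine, tyrosine
]


def is_conservative(original, replacement):
    """True if the two different residues share at least one of the listed groups."""
    original, replacement = original.upper(), replacement.upper()
    return original != replacement and any(
        original in group and replacement in group for group in CONSERVATIVE_GROUPS
    )


print(is_conservative("K", "R"))  # lysine -> arginine, same group
print(is_conservative("K", "D"))  # lysine -> aspartic acid, no shared group
```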

As used herein, the term "fragment", as applied to an M. catarrhalis analog, will
ordinarily be at least about 20 residues, more typically at least about 40 residues, preferably
at least about 60 residues in length. Fragments of M. catarrhalis polypeptides can be
generated by methods known to those skilled in the art. The ability of a Moraxella
fragment to exhibit a biological activity of an M. catarrhalis polypeptide can be assessed by
methods known to those skilled in the art as described herein. Also included are M.
catarrhalis polypeptides containing residues that are not required for biological activity of
the peptide or that result from alternative mRNA splicing or alternative protein processing
events.

An "immunogenic component" as used herein is a moiety, such as an M. catarrhalis
polypeptide, analog or fragment thereof, that is capable of eliciting a humoral and/or cellular
immune response in a host animal.

An "antigenic component" as used herein is a moiety, such as an M. catarrhalis
polypeptide, analog or fragment thereof, that is capable of binding to a specific antibody with
sufficiently high affinity to form a detectable antigen-antibody complex.

The term "antibody" as used herein is intended to include fragments thereof which
are specifically reactive with M. catarrhalis polypeptides.

As used herein, the term "cell-specific promoter" means a DNA sequence that serves
as a promoter, i.e., regulates expression of a selected DNA sequence operably linked to the
promoter, and which effects expression of the selected DNA sequence in specific cells of a
tissue. The term also covers so-called "leaky" promoters, which regulate expression of a
selected DNA primarily in one tissue, but cause expression in other tissues as well.

Misexpression, as used herein, refers to a non-wild type pattern of gene expression.
It includes: expression at non-wild type levels, i.e., over or under expression; a pattern of
expression that differs from wild type in terms of the time or stage at which the gene is
expressed, e.g., increased or decreased expression (as compared with wild type) at a
predetermined developmental period or stage; a pattern of expression that differs from wild
type in terms of increased expression (as compared with wild type) in a predetermined cell
type or tissue type; a pattern of expression that differs from wild type in terms of the splicing
size, amino acid sequence, post-translational modification, or biological activity of the
expressed polypeptide; a pattern of expression that differs from wild type in terms of the
effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g.,
a pattern of increased or decreased expression (as compared with wild type) in the presence
of an increase or decrease in the strength of the stimulus.

As used herein, "host cells" and other such terms denoting microorganisms or higher
eukaryotic cell lines cultured as unicellular entities refers to cells which can become or have
been used as recipients for a recombinant vector or other transfer DNA, and include the
progeny of the original cell which has been transfected. It is understood by individuals
skilled in the art that the progeny of a single parental cell may not necessarily be completely
identical in genomic or total DNA complement to the original parent, due to accidental or
deliberate mutation.

As used herein, the term "control sequence" refers to a nucleic acid having a base
sequence which is recognized by the host organism to effect the expression of encoded
sequences to which it is ligated. The nature of such control sequences differs depending
upon the host organism; in prokaryotes, such control sequences generally include a promoter,
ribosomal binding site, terminators, and in some cases operators; in eukaryotes, generally
such control sequences include promoters, terminators and in some instances, enhancers. The
term control sequence is intended to include at a minimum, all components whose presence
is necessary for expression, and may also include additional components whose presence is
advantageous, for example, leader sequences.

As used herein, the term "operably linked" refers to sequences joined or ligated to
function in their intended manner. For example, a control sequence is operably linked to a
coding sequence by ligation in such a way that expression of the coding sequence is achieved
under conditions compatible with the control sequence and host cell.

The "metabolism" of a substance, as used herein, means any aspect of the expression,
function, action, or regulation of the substance. The metabolism of a substance includes
modifications, e.g., covalent or non-covalent modifications of the substance. The metabolism
of a substance includes modifications, e.g., covalent or non-covalent modification, the
substance induces in other substances. The metabolism of a substance also includes changes
in the distribution of the substance. The metabolism of a substance includes changes the
substance induces in the distribution of other substances.

A "sample" as used herein refers to a biological sample, such as, for example, tissue
or fluid isolated from an individual (including without limitation plasma, serum,
cerebrospinal fluid, lymph, tears, saliva and tissue sections) or from in vitro cell culture
constituents, as well as samples from the environment.

Technical and scientific terms used herein have the meanings commonly understood
by one of ordinary skill in the art to which the present invention pertains, unless otherwise
defined. Reference is made herein to various methodologies known to those of skill in the
art. Publications and other materials setting forth such known methodologies to which
reference is made are incorporated herein by reference in their entireties as though set forth
in full. The practice of the invention will employ, unless otherwise indicated, conventional
techniques of chemistry, molecular biology, microbiology, recombinant DNA, and
immunology, which are within the skill of the art. Such techniques are explained fully in the
literature. See e.g., Sambrook, Fritsch, and Maniatis, Molecular Cloning: A Laboratory
Manual, 2nd ed. (1989); DNA Cloning, Volumes I and II (D.N. Glover ed. 1985);
Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic Acid Hybridization (B.D. Hames &
S.J. Higgins eds. 1984); the series, Methods in Enzymology (Academic Press, Inc.),
particularly Vol. 154 and Vol. 155 (Wu and Grossman, eds.); PCR-A Practical Approach
(McPherson, Quirke, and Taylor, eds., 1991); Immunology, 2d Edition, 1989, Roitt et al.,
C.V. Mosby Company, New York; Advanced Immunology, 2d Edition, 1991, Male et
al., Gower Medical Publishing, New York; DNA Cloning: A Practical Approach, Volumes
I and II, 1985 (D.N. Glover ed.); Oligonucleotide Synthesis, 1984 (M.J. Gait ed.);
Transcription and Translation, 1984 (Hames and Higgins eds.); Animal Cell Culture, 1986
(R.I. Freshney ed.); Immobilized Cells and Enzymes, 1986 (IRL Press); Perbal, 1984, A
Practical Guide to Molecular Cloning; Gene Transfer Vectors for Mammalian Cells, 1987
(J. H. Miller and M. P. Calos eds., Cold Spring Harbor Laboratory); Martin J. Bishop, ed.,
Guide to Human Genome Computing, 2d Edition, Academic Press, San Diego, CA. (1998);
and Leonard F. Peruski, Jr., and Anne Harwood Peruski, The Internet and the New Biology:
Tools for Genomic and Molecular Research, American Society for Microbiology,
Washington, D.C. (1997).

Any suitable materials and/or methods known to those of skill can be utilized in
carrying out the present invention; however, preferred materials and/or methods are
described. Materials, reagents and the like to which reference is made in the following
description and examples are obtainable from commercial sources, unless otherwise noted.

M. CATARRHALIS GENOMIC SEQUENCE 

This invention provides nucleotide sequences of the genome of M. catarrhalis which
thus comprises a DNA sequence library of M. catarrhalis genomic DNA. The detailed
description that follows provides nucleotide sequences of M. catarrhalis, and also describes
how the sequences were obtained and how ORFs and protein-coding sequences were
identified. Also described are compositions and methods of using the disclosed M.
catarrhalis sequences in methods including diagnostic and therapeutic applications.
Furthermore, the library can be used as a database for identification and comparison of
medically important sequences in this and other strains of M. catarrhalis.

To determine the genomic sequence of M. catarrhalis, DNA from strain 98-4362 of
M. catarrhalis was isolated and a library of DNA fragments was transformed into DH5α
cells. DNA sequencing was achieved using established ABI sequencing methods on ABI377
automated DNA sequencers. The cloning and sequencing procedures are described in more
detail in the Exemplification.

Individual sequence reads were assembled using PHRAP (P. Green, Abstracts of
DOE Human Genome Program Contractor-Grantee Workshop V, Jan. 1996, p. 157). The
average contig length was about 3-4 kb.

All subsequent steps were based on sequencing by ABI377 automated DNA
sequencing methods. The cloning and sequencing procedures are described in more detail in
the Exemplification.

A variety of approaches may be used to order the contigs so as to obtain a continuous
sequence representing the entire M. catarrhalis genome. Synthetic oligonucleotides are
designed that are complementary to sequences at the end of each contig. These
oligonucleotides may be hybridized to libraries of M. catarrhalis genomic DNA in, for
example, lambda phage vectors or plasmid vectors to identify clones that contain sequences
corresponding to the junctional regions between individual contigs. Such clones are then
used to isolate template DNA and the same oligonucleotides are used as primers in
polymerase chain reaction (PCR) to amplify junctional fragments, the nucleotide sequence of
which is then determined.

The M. catarrhalis sequences were analyzed for the presence of open reading frames
(ORFs) comprising at least 180 nucleotides. As a result of the analysis of ORFs based on
stop-to-stop codon reads, it should be understood that these ORFs may not correspond to the
ORF of a naturally-occurring M. catarrhalis polypeptide. These ORFs may contain start
codons which indicate the initiation of protein synthesis of a naturally-occurring M.
catarrhalis polypeptide. Such start codons within the ORFs provided herein were identified
by those of ordinary skill in the relevant art, and the resulting ORF and the encoded M.
catarrhalis polypeptide are within the scope of this invention. For example, within the ORFs
a codon such as AUG or GUG (encoding methionine or valine) which is part of the initiation
signal for protein synthesis was identified and the portion of an ORF corresponding to a
naturally-occurring M. catarrhalis polypeptide was recognized. The predicted coding
regions were defined by evaluating the coding potential of such sequences with the program
GENEMARK™ (Borodovsky and McIninch, 1993, Comp. Chem. 17:123).
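
The stop-to-stop ORF analysis described above can be sketched in Python as follows. This is an illustrative sketch only and is not the procedure actually used to produce the Sequence Listing: it assumes the stop codons TAA, TAG and TGA, scans only the three forward-strand frames, ignores regions not bounded by a stop codon on both sides, and does not evaluate coding potential as GENEMARK does. The function name and return format are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_to_stop_orfs(dna, min_length=180):
    """Yield (frame, start, end) for stop-to-stop ORFs on the forward strand.

    start and end are 0-based offsets of the region between two consecutive
    in-frame stop codons (the stops themselves excluded); end is exclusive.
    """
    dna = dna.upper()
    for frame in range(3):
        region_start = None  # set once the first in-frame stop has been seen
        for pos in range(frame, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                if region_start is not None and pos - region_start >= min_length:
                    yield frame, region_start, pos
                region_start = pos + 3


# Hypothetical input: 60 alanine codons (180 nucleotides) flanked by two stop codons.
example = "TAA" + "GCT" * 60 + "TAG"
print(list(stop_to_stop_orfs(example)))
```

To cover the complementary strand, the same scan could be applied to the reverse complement of the input sequence.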

Each predicted ORF amino acid sequence was compared with all sequences found in
current GENBANK, SWISS-PROT, and PIR databases using the BLAST algorithm. BLAST
identifies local alignments between the ORF sequence and the sequence in the databank and
estimates the probability of each alignment occurring by chance (Altschul et al., 1990, J. Mol.
Biol. 215:403-410). Homologous ORFs (probabilities less than 10^-5 by chance) and ORFs
that are probably non-homologous (probabilities greater than 10^-5 by chance) but have good
codon usage were identified. Both homologous sequences and non-homologous sequences
with good codon usage are likely to encode proteins and are encompassed by the invention.
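
The classification rule stated above can be summarized in a short Python sketch. It assumes that a chance probability for the best database match is already available and that codon usage has been judged separately and reduced to a yes/no flag; how codon usage was scored is not specified in the text, so that flag, the function name and the returned labels are hypothetical. A probability exactly equal to the cutoff is treated here as non-homologous, which is also an assumption.

```python
HOMOLOGY_THRESHOLD = 1e-5  # chance probability cutoff stated in the text


def classify_orf(chance_probability, good_codon_usage):
    """Classify an ORF from its best database match and a codon-usage flag."""
    if chance_probability < HOMOLOGY_THRESHOLD:
        return "homologous"
    if good_codon_usage:
        return "non-homologous, good codon usage"
    return "not retained"


print(classify_orf(1e-20, False))
print(classify_orf(0.01, True))
```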

M. CATARRHALIS NUCLEIC ACIDS

The present invention provides a library of M. catarrhalis-derived nucleic acid
sequences. The libraries provide probes, primers, and markers which are used in
epidemiological studies. The present invention also provides a library of M. catarrhalis-derived
nucleic acid sequences which comprise or encode targets for therapeutic drugs.

The nucleic acids of this invention may be obtained directly from the DNA of the
above referenced M. catarrhalis strain by using the polymerase chain reaction (PCR). See
"PCR, A Practical Approach" (McPherson, Quirke, and Taylor, eds., IRL Press, Oxford,
UK, 1991) for details about the PCR. High fidelity PCR is used to ensure a faithful DNA
copy prior to expression. In addition, the authenticity of amplified products is verified by
conventional sequencing methods. Clones carrying the desired sequences described in this
invention may also be obtained by screening the libraries by means of the PCR or by
hybridization of synthetic oligonucleotide probes to filter lifts of the library colonies or
plaques as known in the art (see, e.g., Sambrook et al., Molecular Cloning, A Laboratory
Manual 2nd edition, 1989, Cold Spring Harbor Press, NY).

It is also possible to obtain nucleic acids encoding M. catarrhalis polypeptides from a
cDNA library in accordance with protocols herein described. A cDNA encoding an M.
catarrhalis polypeptide can be obtained by isolating total mRNA from an appropriate strain.
Double stranded cDNAs can then be prepared from the total mRNA. Subsequently, the
cDNAs can be inserted into a suitable plasmid or viral (e.g., bacteriophage) vector using any
one of a number of known techniques. Genes encoding M. catarrhalis polypeptides can also
be cloned using established polymerase chain reaction techniques in accordance with the
nucleotide sequence information provided by the invention. The nucleic acids of the
invention can be DNA or RNA. Preferred nucleic acids of the invention are contained in the
Sequence Listing.

The nucleic acids of the invention can also be chemically synthesized using standard
techniques. Various methods of chemically synthesizing polydeoxynucleotides are known,
including solid-phase synthesis which, like peptide synthesis, has been fully automated in
commercially available DNA synthesizers (See e.g., Itakura et al. U.S. Patent No. 4,598,049;
Caruthers et al. U.S. Patent No. 4,458,066; and Itakura U.S. Patent Nos. 4,401,796 and
4,373,071, incorporated by reference herein).

In another example, DNA can be chemically synthesized using, e.g., the
phosphoramidite solid support method of Matteucci et al., 1981, J. Am. Chem. Soc.
103:3185, the method of Yoo et al., 1989, J. Biol. Chem. 264:17078, or other well known
methods. This can be done by sequentially linking a series of oligonucleotide cassettes
comprising pairs of synthetic oligonucleotides, as described below.

Nucleic acids isolated or synthesized in accordance with features of the present 
invention are useful, by way of example, without limitation, as probes, primers, capture 
ligands, antisense genes and for developing expression systems for the synthesis of proteins 
and peptides corresponding to such sequences. As probes, primers, capture ligands and
antisense agents, the nucleic acid normally consists of all or part (approximately twenty or 
more nucleotides for specificity as well as the ability to form stable hybridization products) 
of the nucleic acids of the invention contained in the Sequence Listing. These uses are 
described in further detail below. 

PROBES 

A nucleic acid isolated or synthesized in accordance with the sequence of the
invention contained in the Sequence Listing can be used as a probe to specifically detect
M. catarrhalis. With the sequence information set forth in the present application, sequences
of twenty or more nucleotides are identified which provide the desired inclusivity and
exclusivity with respect to M. catarrhalis and extraneous nucleic acids likely to be
encountered during hybridization conditions. More preferably, the sequence will comprise at
least about twenty to thirty nucleotides to convey stability to the hybridization product
formed between the probe and the intended target molecules.

Sequences larger than 1000 nucleotides in length are difficult to synthesize but can be
generated by recombinant DNA techniques. Individuals skilled in the art will readily
recognize that the nucleic acids, for use as probes, can be provided with a label to facilitate
detection of a hybridization product.

Nucleic acid isolated or synthesized in accordance with the sequence of the
invention contained in the Sequence Listing can also be useful as probes to detect
homologous regions (especially homologous genes) of other Moraxella species using
appropriate stringency hybridization conditions as described herein.
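
As a rough illustration of inclusivity and exclusivity, the following Python sketch lists subsequences of a chosen length that occur in every target sequence and in none of the non-target sequences on either strand. It is a simplification: the function names are hypothetical, and exact string matching is assumed, whereas real hybridization tolerates mismatches depending on stringency.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of all length-k subsequences of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_probes(targets, non_targets, k=20):
    """List k-mers found in every target (inclusivity) and in no
    non-target sequence on either strand (exclusivity)."""
    if not targets:
        return []
    shared = set.intersection(*(kmers(t, k) for t in targets))
    excluded = set()
    for s in non_targets:
        excluded |= kmers(s, k) | kmers(reverse_complement(s), k)
    return sorted(shared - excluded)
```

The default of 20 follows the minimum probe length given above; candidates would still need experimental confirmation under the intended hybridization conditions.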

CAPTURE LIGAND

For use as a capture ligand, the nucleic acid selected in the manner described above
with respect to probes can be readily associated with a support. The manner in which
nucleic acid is associated with supports is well known. Nucleic acid having twenty or more
nucleotides in a sequence of the invention contained in the Sequence Listing has utility to
separate M. catarrhalis nucleic acid of one strain from the nucleic acid of another
strain as well as from other organisms. Nucleic acid having twenty or more nucleotides in a
sequence of the invention contained in the Sequence Listing can also have utility to separate
other Moraxella species from each other and from other organisms. Preferably, the sequence
will comprise at least about twenty nucleotides to convey stability to the hybridization
product formed between the probe and the intended target molecules. Sequences larger than
1000 nucleotides in length are difficult to synthesize but can be generated by recombinant
DNA techniques.

PRIMERS 

Nucleic acids isolated or synthesized in accordance with the sequences described
herein have utility as primers for the amplification of M. catarrhalis nucleic acid. These
nucleic acids may also have utility as primers for the amplification of nucleic acids in other
Moraxella species. With respect to polymerase chain reaction (PCR) techniques, nucleic
acid sequences of more than 10-15 nucleotides of the invention contained in the Sequence
Listing have utility in conjunction with suitable enzymes and reagents to create copies of
M. catarrhalis nucleic acid. More preferably, the sequence will comprise twenty or more
nucleotides to convey stability to the hybridization product formed between the primer and
the intended target molecules. Binding conditions of primers greater than 100 nucleotides
are more difficult to control to obtain specificity. High fidelity PCR can be used to ensure a
faithful DNA copy prior to expression. In addition, amplified products can be checked by
conventional sequencing methods.

The copies can be used in diagnostic assays to detect specific sequences, including 
genes from M. catarrhalis and/or other Moraxella species. The copies can also be 
incorporated into cloning and expression vectors to generate polypeptides corresponding to 
the nucleic acid synthesized by PCR, as is described in greater detail herein.
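
A minimal sketch of how a primer pair might be derived from a known sequence is shown below. It assumes the simplest possible design, in which the forward primer is the first `length` bases of the region and the reverse primer is the reverse complement of its last `length` bases. The function names are hypothetical, and practical primer design would also weigh melting temperature, secondary structure, and specificity, none of which is modeled here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primer_pair(region, length=20):
    """Return (forward, reverse) primers flanking the given region.

    The forward primer copies the 5' end of the top strand; the reverse
    primer is the reverse complement of the 3' end, so that both primers
    read 5' to 3' toward each other.
    """
    region = region.upper()
    if len(region) < 2 * length:
        raise ValueError("region is too short for two non-overlapping primers")
    return region[:length], reverse_complement(region[-length:])
```

The default length of 20 reflects the preference stated above for primers of twenty or more nucleotides.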

The nucleic acids of the present invention find use as templates for the recombinant
production of M. catarrhalis-derived peptides or polypeptides.

ANTISENSE 

Nucleic acid or nucleic acid-hybridizing derivatives isolated or synthesized in
accordance with the sequences described herein have utility as antisense agents to prevent
the expression of M. catarrhalis genes. These sequences also have utility as antisense agents
to prevent expression of genes of other Moraxella species.

In one embodiment, nucleic acid or derivatives corresponding to M. catarrhalis
nucleic acids are loaded into a suitable carrier such as a liposome or bacteriophage for
introduction into bacterial cells. For example, a nucleic acid having twenty or more
nucleotides is capable of binding to bacterial nucleic acid or bacterial messenger RNA.
Preferably, the antisense nucleic acid is comprised of 20 or more nucleotides to provide
necessary stability of a hybridization product of non-naturally occurring nucleic acid and
bacterial nucleic acid and/or bacterial messenger RNA. Nucleic acid having a sequence
greater than 1000 nucleotides in length is difficult to synthesize but can be generated by
recombinant DNA techniques. Methods for loading antisense nucleic acid in liposomes are
known in the art as exemplified by U.S. Patent 4,241,046 issued December 23, 1980 to
Papahadjopoulos et al.

The present invention encompasses isolated polypeptides and nucleic acids derived
from M. catarrhalis that are useful as reagents for diagnosis of bacterial infection,
components of effective anti-bacterial vaccines, and/or as targets for anti-bacterial drugs,
including anti-M. catarrhalis drugs.

EXPRESSION OF M. CATARRHALIS NUCLEIC ACIDS

Table 2, which is appended herewith and which forms part of the present
specification, provides a list of open reading frames (ORFs) in both strands and a putative
identification of the particular function of a polypeptide which is encoded by each ORF,
based on the homology match (determined by the BLASTP2 algorithm) of the predicted
polypeptide with known proteins encoded by ORFs in other organisms. An ORF is a region
of nucleic acid which encodes a polypeptide. This region may represent a portion of a
coding sequence or a total sequence and was determined from stop to stop codons. The first
column contains a designation for the ORF ("ORF Name"). The second and third columns
list the SEQ ID numbers for the nucleic acid ("NT ID") and amino acid ("AA ID")
sequences corresponding to each ORF, respectively. The fourth and fifth columns list the
length of the nucleic acid ORF ("NT Length") and the length of the amino acid ORF ("AA
Length"), respectively. The nucleotide sequence corresponding to each ORF begins at the
first nucleotide immediately following a stop codon and ends at the nucleotide immediately
preceding the next downstream stop codon in the same reading frame. It will be recognized
by one skilled in the art that the natural translation initiation sites will correspond to ATG,
GTG, or TTG codons located within the ORFs. The natural initiation sites depend not only
on the sequence of a start codon but also on the context of the DNA sequence adjacent to the
start codon. Usually, a recognizable ribosome binding site is found within 20 nucleotides
upstream from the initiation codon. In some cases where genes are translationally coupled
and coordinately expressed together in "operons", ribosome binding sites are not present, but
the initiation codon of a downstream gene may occur very close to, or overlap, the stop
codon of an upstream gene in the same operon. The correct start codons can generally be
identified without undue experimentation because only a few codons need be tested. It is
recognized that the translational machinery in bacteria initiates all polypeptide chains with
the amino acid methionine, regardless of the sequence of the start codon. In some cases,
polypeptides are post-translationally modified, resulting in an N-terminal amino acid other
than methionine in vivo. The sixth and seventh columns provide metrics for assessing the
likelihood of the homology match (determined by the BLASTP2 algorithm), as is known in
the art, to the genes indicated in the description frame ("Description") defined further below.
These genes in the Description were identified when the designated ORF was compared
against a comprehensive non-redundant protein database. Specifically, the sixth column
represents the Blast Score ("Score") for the match (a higher score is a better match), and the
seventh column represents the probability ("Probability") for the match (the probability that
such a match can have occurred by chance; the lower the value, the more likely the match is
valid). If a BLASTP2 score of less than 100 was obtained, no value is reported in the table.
The remaining fields below the columns contain additional information relating to the
potential function of the sequence based on the BLASTP2 analysis. Where a match was
discovered, the field "Protein name" lists the protein's name identified from the match. In
addition, one skilled in the art would be able to identify the match and elucidate its function
using the "Locus name" and, where available, the accession number ("Acc#") from the
database. Lastly, one skilled in the art would appreciate that the "Description" field further
describes the potential function of the protein based on this analysis. This information allows
one of ordinary skill in the art to determine a potential use for each identified coding
sequence and, as a result, to use the polypeptides of the present invention for
commercial and industrial purposes.
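
The stop-to-stop convention used for Table 2 can be illustrated with a short Python sketch. This is not the software used to build the table; the function names are hypothetical, the standard stop codons TAA, TAG, and TGA are assumed, and only one strand is scanned (the other strand could be handled by passing in the reverse complement).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG", "GTG", "TTG"}  # candidate initiation codons named in the text


def stop_to_stop_orfs(seq, min_codons=1):
    """Return (frame, start, end) for each stop-to-stop ORF on this strand.

    start is the index of the first nucleotide after a stop codon; end is
    the index of the next in-frame stop codon (exclusive end of the ORF).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        orf_start = None  # stays None until the first in-frame stop is seen
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                if orf_start is not None and (i - orf_start) // 3 >= min_codons:
                    orfs.append((frame, orf_start, i))
                orf_start = i + 3
    return orfs


def candidate_starts(seq, orf_start, orf_end):
    """Return in-frame positions of ATG, GTG, or TTG codons within an ORF."""
    seq = seq.upper()
    return [i for i in range(orf_start, orf_end, 3) if seq[i:i + 3] in START_CODONS]
```

Each position returned by `candidate_starts` is only a candidate; as noted above, the natural initiation site also depends on sequence context such as an upstream ribosome binding site.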

Using the information provided in SEQ ID NO: 1 - SEQ ID NO: 1920, SEQ ID NO: 
1921 - SEQ ID NO: 3840 and in Table 2 together with routine cloning and sequencing 
methods, one of ordinary skill in the art will be able to clone and sequence all the nucleic 
acid fragments of interest including open reading frames (ORFs) encoding a large variety of
proteins of M. catarrhalis. 

Nucleic acids isolated or synthesized in accordance with the sequences described
herein have utility to generate polypeptides. The nucleic acid of the invention exemplified in
SEQ ID NO: 1 - SEQ ID NO: 1920 and in Table 2 or fragments of said nucleic acid
encoding active portions of M. catarrhalis polypeptides can be cloned into suitable vectors
or used to isolate nucleic acid. The isolated nucleic acid is combined with suitable DNA
linkers and cloned into a suitable vector.

The function of a specific gene or operon can be ascertained by expression in a
bacterial strain under conditions where the activity of the gene product(s) specified by the
gene or operon in question can be specifically measured. Alternatively, a gene product may
be produced in large quantities in an expressing strain for use as an antigen, an industrial
reagent, for structural studies, etc. This expression can be accomplished in a mutant strain
which lacks the activity of the gene to be tested, or in a strain that does not produce the same
gene product(s). This includes, but is not limited to, eucaryotic species such as the yeast
Saccharomyces cerevisiae, Methanobacterium strains or other Archaea, and Eubacteria such
as E. coli, B. subtilis, S. aureus, S. pneumoniae or Pseudomonas putida. In some cases the
expression host will utilize the natural M. catarrhalis promoter whereas in others, it will be
necessary to drive the gene with a promoter sequence derived from the expressing organism
(e.g., an E. coli beta-galactosidase promoter for expression in E. coli).

To express a gene product using the natural M. catarrhalis promoter, a procedure
such as the following can be used. A restriction fragment containing the gene of interest,
together with its associated natural promoter element and regulatory sequences (identified
using the DNA sequence data), is cloned into an appropriate recombinant plasmid containing
an origin of replication that functions in the host organism and an appropriate selectable
marker. This can be accomplished by a number of procedures known to those skilled in the
art. It is most preferably done by cutting the plasmid and the fragment to be cloned with the
same restriction enzyme to produce compatible ends that can be ligated to join the two
pieces together. The recombinant plasmid is introduced into the host organism by, for
example, electroporation, and cells containing the recombinant plasmid are identified by
selection for the marker on the plasmid. Expression of the desired gene product is detected
using an assay specific for that gene product.

In the case of a gene that requires a different promoter, the body of the gene (coding
sequence) is specifically excised and cloned into an appropriate expression plasmid. This
subcloning can be done by several methods, but is most easily accomplished by PCR
amplification of a specific fragment and ligation into an expression plasmid after treating the
PCR product with a restriction enzyme or exonuclease to create suitable ends for cloning.

A suitable host cell for expression of a gene can be any procaryotic or eucaryotic cell.
Suitable methods for transforming host cells can be found in Sambrook et al. (Molecular
Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press (1989)),
and other laboratory textbooks.

For example, a host cell transfected with a nucleic acid vector directing expression of
a nucleotide sequence encoding an M. catarrhalis polypeptide can be cultured under
appropriate conditions to allow expression of the polypeptide to occur. Suitable media for
cell culture are well known in the art. Polypeptides of the invention can be isolated from cell
culture medium, host cells, or both using techniques known in the art for purifying proteins
including ion-exchange chromatography, gel filtration chromatography, ultrafiltration,
electrophoresis, and immunoaffinity purification with antibodies specific for such
polypeptides. Additionally, in many situations, polypeptides can be produced by chemical
cleavage of a native protein (e.g., tryptic digestion) and the cleavage products can then be
purified by standard techniques.

In the case of membrane bound proteins, these can be isolated from a host cell by
contacting a membrane-associated protein fraction with a detergent forming a solubilized
complex, where the membrane-associated protein is no longer entirely embedded in the
membrane fraction and is solubilized at least to an extent which allows it to be
chromatographically isolated from the membrane fraction. Chromatographic techniques
which can be used in the final purification step are known in the art and include hydrophobic
interaction, lectin affinity, ion exchange, dye affinity and immunoaffinity.

One strategy to maximize recombinant M. catarrhalis peptide expression in E. coli is
to express the protein in a host bacterium with an impaired capacity to proteolytically cleave
the recombinant protein (Gottesman, S., Gene Expression Technology: Methods in
Enzymology 185, Academic Press, San Diego, California (1990) 119-128). Another strategy
would be to alter the nucleic acid encoding an M. catarrhalis peptide to be inserted into an
expression vector so that the individual codons for each amino acid would be those
preferentially utilized in highly expressed E. coli proteins (Wada et al., (1992) Nuc. Acids
Res. 20:2111-2118). Such alteration of nucleic acids of the invention can be carried out by
standard DNA synthesis techniques.

The nucleic acids of the invention can also be chemically synthesized using standard
techniques. Various methods of chemically synthesizing polydeoxynucleotides are known,
including solid-phase synthesis which, like peptide synthesis, has been fully automated in
commercially available DNA synthesizers (see, e.g., Itakura et al. U.S. Patent No.
4,598,049; Caruthers et al. U.S. Patent No. 4,458,066; and Itakura U.S. Patent Nos.
4,401,796 and 4,373,071, incorporated by reference herein).

The present invention provides a library of M. catarrhalis-derived nucleic acid
sequences. The libraries provide probes, primers, and markers which can be used as markers
in epidemiological studies. The present invention also provides a library of
M. catarrhalis-derived nucleic acid sequences which comprise or encode targets for
therapeutic drugs.

Nucleic acids comprising any of the sequences disclosed herein or sub-sequences
thereof can be prepared by standard methods using the nucleic acid sequence information
provided in SEQ ID NO: 1 - SEQ ID NO: 1920. For example, DNA can be chemically
synthesized using, e.g., the phosphoramidite solid support method of Matteucci et al., 1981,
J. Am. Chem. Soc. 103:3185, the method of Yoo et al., 1989, J. Biol. Chem. 264:17078, or
other well known methods. This can be done by sequentially linking a series of
oligonucleotide cassettes comprising pairs of synthetic oligonucleotides, as described below.

Of course, due to the degeneracy of the genetic code, many different nucleotide
sequences can encode polypeptides having the amino acid sequences defined by SEQ ID
NO: 1921 - SEQ ID NO: 3840 or sub-sequences thereof. The codons can be selected for
optimal expression in prokaryotic or eukaryotic systems. Such degenerate variants are also
encompassed by this invention.
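
To give a sense of scale for this degeneracy, the Python sketch below counts how many distinct nucleotide sequences encode a given peptide. It assumes the standard genetic code, built here from its conventional compact form; the function names are hypothetical and stop codons are not appended.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (an assumption of this sketch).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group synonymous codons by the amino acid they encode.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate a DNA coding sequence codon by codon."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def count_encodings(peptide):
    """Count the distinct nucleotide sequences that encode the peptide."""
    total = 1
    for aa in peptide.upper():
        total *= len(SYNONYMS[aa])
    return total
```

For instance, `ATGCTG` and `ATGTTA` both translate to the dipeptide ML, which has six possible encodings because leucine has six codons and methionine one.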

Insertion of nucleic acids (typically DNAs) encoding the polypeptides of the
invention into a vector is easily accomplished when the termini of both the DNAs and the
vector comprise compatible restriction sites. If this cannot be done, it may be necessary to
modify the termini of the DNAs and/or vector by digesting back single-stranded DNA
overhangs generated by restriction endonuclease cleavage to produce blunt ends, or to
achieve the same result by filling in the single-stranded termini with an appropriate DNA
polymerase.

Alternatively, any site desired may be produced, e.g., by ligating nucleotide
sequences (linkers) onto the termini. Such linkers may comprise specific oligonucleotide
sequences that define desired restriction sites. Restriction sites can also be generated by the
use of the polymerase chain reaction (PCR). See, e.g., Saiki et al., 1988, Science 239:48.
The cleaved vector and the DNA fragments may also be modified if required by
homopolymeric tailing.

The nucleic acids of the invention may be isolated directly from cells. Alternatively, 
the polymerase chain reaction (PCR) method can be used to produce the nucleic acids of the
invention, using either chemically synthesized strands or genomic material as templates. 
Primers used for PCR can be synthesized using the sequence information provided herein 
and can further be designed to introduce appropriate new restriction sites, if desirable, to 
facilitate incorporation into a given vector for recombinant expression. 
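
One way to picture a primer that introduces a new restriction site is the Python sketch below, which prepends a recognition sequence to a gene-specific primer after checking that the site does not already occur inside the fragment to be cloned. The function name and the short 5' `clamp` of extra bases are assumptions made for illustration; only the EcoRI (GAATTC) and BamHI (GGATCC) recognition sequences are standard facts.

```python
# Recognition sequences of two common enzymes; both are palindromic,
# so checking one strand of the insert is sufficient.
RECOGNITION_SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}


def tailed_primer(primer, enzyme, insert, clamp="GCGC"):
    """Return clamp + restriction site + primer (5' to 3').

    Raises ValueError if the site also occurs within the insert, since the
    enzyme would then cut the fragment internally. The clamp is an
    illustrative choice of extra 5' bases, not a prescribed sequence.
    """
    site = RECOGNITION_SITES[enzyme]
    if site in insert.upper():
        raise ValueError(f"{enzyme} site occurs inside the insert")
    return clamp + site + primer.upper()
```

A pair of such primers carrying sites that match the cloning sites of the chosen vector would yield an amplified product whose ends can be cut and ligated into that vector.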

The nucleic acids of the present invention may be flanked by natural M. catarrhalis 
regulatory sequences, or may be associated with heterologous sequences, including 
promoters, enhancers, response elements, signal sequences, polyadenylation sequences, 
introns, 5'- and 3'- noncoding regions, and the like. The nucleic acids may also be modified 
by many means known in the art. Non-limiting examples of such modifications include 
methylation, "caps", substitution of one or more of the naturally occurring nucleotides with
an analog, internucleotide modifications such as, for example, those with uncharged linkages 
(e.g., methyl phosphonates, phosphotriesters, phosphoroamidates, carbamates, etc.) and with 
charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.). Nucleic acids may 
contain one or more additional covalently linked moieties, such as, for example, proteins 
(e.g., nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.), intercalators (e.g., 
acridine, psoralen, etc.), chelators (e.g., metals, radioactive metals, iron, oxidative metals, 
etc.), and alkylators. PNAs are also included. The nucleic acid may be derivatized by 
formation of a methyl or ethyl phosphotriester or an alkyl phosphoramidate linkage. 
Furthermore, the nucleic acid sequences of the present invention may also be modified with
a label capable of providing a detectable signal, either directly or indirectly. Exemplary
labels include radioisotopes, fluorescent molecules, biotin, and the like. 

The invention also provides nucleic acid vectors comprising the disclosed
M. catarrhalis-derived sequences or derivatives or fragments thereof. A large number of
vectors, including plasmid and bacterial vectors, have been described for replication and/or
expression in a variety of eukaryotic and prokaryotic hosts, and may be used for cloning or
protein expression.

The encoded M. catarrhalis polypeptides may be expressed by using many known 
vectors, such as pUC plasmids, pET plasmids (Novagen, Inc., Madison, WI), or pRSET or 
pREP (Invitrogen, San Diego, CA), and many appropriate host cells, using methods
disclosed or cited herein or otherwise known to those skilled in the relevant art. The 
particular choice of vector/host is not critical to the practice of the invention. 

Recombinant cloning vectors will often include one or more replication systems for
cloning or expression, one or more markers for selection in the host, e.g., antibiotic
resistance, and one or more expression cassettes. The inserted M. catarrhalis coding
sequences may be synthesized by standard methods, isolated from natural sources, or
prepared as hybrids, etc. Ligation of the M. catarrhalis coding sequences to transcriptional
regulatory elements and/or to other amino acid coding sequences may be achieved by known
methods. Suitable host cells may be transformed/transfected/infected as appropriate by any
suitable method including electroporation, CaCl2-mediated DNA uptake, bacterial infection,
microinjection, microprojectile, or other established methods.

Appropriate host cells include bacteria, archaebacteria, fungi, especially yeast, and
plant and animal cells, especially mammalian cells. Of particular interest are M. catarrhalis,
E. coli, B. subtilis, Saccharomyces cerevisiae, Saccharomyces carlsbergensis,
Schizosaccharomyces pombe, SF9 cells, C129 cells, 293 cells, Neurospora, and CHO cells,
COS cells, HeLa cells, and immortalized mammalian myeloid and lymphoid cell lines.
Preferred replication systems include M13, ColE1, SV40, baculovirus, lambda, adenovirus,
and the like. A large number of transcription initiation and termination regulatory regions
have been isolated and shown to be effective in the transcription and translation of
heterologous proteins in the various hosts. Examples of these regions, methods of isolation,
manner of manipulation, etc. are known in the art. Under appropriate expression conditions,
host cells can be used as a source of recombinantly produced M. catarrhalis-derived
peptides and polypeptides.

Advantageously, vectors may also include a transcription regulatory element (i.e., a
promoter) operably linked to the M. catarrhalis portion. The promoter may optionally
contain operator portions and/or ribosome binding sites. Non-limiting examples of bacterial
promoters compatible with E. coli include: beta-lactamase (penicillinase) promoter; lactose
promoter; tryptophan (trp) promoter; araBAD (arabinose) operon promoter; lambda-derived
PL promoter and N gene ribosome binding site; and the hybrid tac promoter derived from
sequences of the trp and lac UV5 promoters. Non-limiting examples of yeast promoters
include 3-phosphoglycerate kinase promoter, glyceraldehyde-3-phosphate dehydrogenase
(GAPDH) promoter, galactokinase (GAL1) promoter, galactoepimerase promoter, and
alcohol dehydrogenase (ADH) promoter. Suitable promoters for mammalian cells include
without limitation viral promoters such as that from Simian Virus 40 (SV40), Rous sarcoma
virus (RSV), adenovirus (ADV), and bovine papilloma virus (BPV). Mammalian cells may
also require terminator sequences, polyA addition sequences and enhancer sequences to
increase expression. Sequences which cause amplification of the gene may also be desirable.
Furthermore, sequences that facilitate secretion of the recombinant product from cells,
including, but not limited to, bacteria, yeast, and animal cells, such as secretory signal
sequences and/or prohormone pro region sequences, may also be included. These sequences
are well described in the art.

Nucleic acids encoding wild-type or variant M. catarrhalis-derived polypeptides
may also be introduced into cells by recombination events. For example, such a sequence
can be introduced into a cell, and thereby effect homologous recombination at the site of an
endogenous gene or a sequence with substantial identity to the gene. Other
recombination-based methods such as nonhomologous recombinations or deletion of
endogenous genes by homologous recombination may also be used.

The nucleic acids of the present invention find use as templates for the recombinant
production of M. catarrhalis-derived peptides or polypeptides.

IDENTIFICATION AND USE OF M. CATARRHALIS NUCLEIC ACID SEQUENCES

The disclosed M. catarrhalis polypeptide and nucleic acid sequences, or other
sequences that are contained within ORFs, including complete protein-coding sequences, of
which any of the disclosed M. catarrhalis-specific sequences forms a part, are useful as
target components for diagnosis and/or treatment of M. catarrhalis-caused infection.

It will be understood that the sequence of an entire protein-coding sequence of which 
each disclosed nucleic acid sequence forms a part can be isolated and identified based on 
each disclosed sequence. This can be achieved, for example, by using an isolated nucleic 
acid encoding the disclosed sequence, or fragments thereof, to prime a sequencing reaction 
with genomic M. catarrhalis DNA as template; this is followed by sequencing the amplified
product. The isolated nucleic acid encoding the disclosed sequence, or fragments thereof,
can also be hybridized to M. catarrhalis genomic libraries to identify clones containing
additional complete segments of the protein-coding sequence of which the shorter sequence 
forms a part. Then, the entire protein-coding sequence, or fragments thereof, or nucleic 
acids encoding all or part of the sequence, or sequence-conservative or function-conservative 
variants thereof, may be employed in practicing the present invention. 

Preferred sequences are those that are useful in diagnostic and/or therapeutic 
applications. Diagnostic applications include without limitation nucleic-acid-based and 
antibody-based methods for detecting bacterial infection. Therapeutic applications include 
without limitation vaccines, passive immunotherapy, and drug treatments directed against 
gene products that are both unique to bacteria and essential for growth and/or replication of
bacteria.



IDENTIFICATION OF NUCLEIC ACIDS ENCODING VACCINE COMPONENTS AND
TARGETS FOR AGENTS EFFECTIVE AGAINST M. CATARRHALIS

The disclosed M. catarrhalis genome sequence includes segments that direct the 
synthesis of ribonucleic acids and polypeptides, as well as origins of replication, promoters, 
other types of regulatory sequences, and intergenic nucleic acids. The invention 
encompasses nucleic acids encoding immunogenic components of vaccines and targets for 
agents effective against M. catarrhalis. Identification of said immunogenic components
involves the determination of the function of the disclosed sequences, which can be
achieved using a variety of approaches. Non-limiting examples of these approaches are
described briefly below.

HOMOLOGY TO KNOWN SEQUENCES: 

Computer-assisted comparison of the disclosed M. catarrhalis sequences with 
previously reported sequences present in publicly available databases is useful for identifying 
functional M. catarrhalis nucleic acid and polypeptide sequences. It will be understood that 
protein-coding sequences, for example, may be compared as a whole, and that a high degree 
of sequence homology between two proteins (such as, for example, >80-90%) at the amino 
acid level indicates that the two proteins also possess some degree of functional homology, 
such as, for example, among enzymes involved in metabolism, DNA synthesis, or cell wall 
synthesis, and proteins involved in transport, cell division, etc. In addition, many structural 
features of particular protein classes have been identified and correlate with specific 
consensus sequences, such as, for example, binding domains for nucleotides, DNA, metal 
ions, and other small molecules; sites for covalent modifications such as phosphorylation, 
acylation, and the like; sites of protein:protein interactions, etc. These consensus sequences
may be quite short and thus may represent only a fraction of the entire protein-coding 
sequence. Identification of such a feature in an M. catarrhalis sequence is therefore useful in 
determining the function of the encoded protein and identifying useful targets of antibacterial 
drugs. 
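
By way of non-limiting illustration, the degree of amino acid identity between two already
aligned protein sequences could be computed as sketched below. The function name, the gap
character "-", and the two toy sequences are assumptions made for this example only; they are
not taken from the Sequence Listing, and the sketch does not perform the alignment itself.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are not counted
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical, pre-aligned toy sequences (illustration only).
query = "MKTAYIAKQR-QISFVKSHFS"
subject = "MKTAYLAKQRGQISFVRSHFS"
print(f"{percent_identity(query, subject):.1f}% identity")
```

A value above the >80-90% range mentioned above would then be one indication, among others,
of possible functional homology.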

Of particular relevance to the present invention are structural features that are 
common to secretory, transmembrane, and surface proteins, including secretion signal 
peptides and hydrophobic transmembrane domains. M. catarrhalis proteins identified as 
containing putative signal sequences and/or transmembrane domains are useful as 
immunogenic components of vaccines. 
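
As one hedged illustration of how a hydrophobic transmembrane domain might be flagged, the
sketch below averages Kyte-Doolittle hydropathy values over a sliding window. This is a
commonly used heuristic substituted here for illustration; it is not the discriminant analysis
of Klein et al. referred to elsewhere herein. The window length of 19 residues and the threshold
of 1.6 are conventional choices assumed for the example.

```python
# Kyte-Doolittle hydropathy scale (one value per standard amino acid).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_windows(seq: str, window: int = 19) -> list:
    """Mean hydropathy of every full-length window along the sequence."""
    seq = seq.upper()
    scores = []
    for i in range(len(seq) - window + 1):
        segment = seq[i:i + window]
        # Unknown residue codes contribute 0.0 (an assumption of this sketch).
        scores.append(sum(KD.get(aa, 0.0) for aa in segment) / window)
    return scores


def has_putative_tm_segment(seq: str, window: int = 19, threshold: float = 1.6) -> bool:
    """True if any window reaches the assumed hydropathy threshold."""
    return any(score >= threshold for score in hydropathy_windows(seq, window))


print(has_putative_tm_segment("MK" + "LIVALLAVGLLIAFVLLAG" + "RRDE"))
```

Sequences shorter than the window produce no windows and are therefore reported as lacking a
putative segment.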

Targets for therapeutic drugs according to the invention include, but are not limited 
to, polypeptides of the invention, whether unique to M. catarrhalis or not, that are essential 
for growth and/or viability of M. catarrhalis under at least one growth condition. 
Polypeptides essential for growth and/or viability can be determined by examining the effect
of deleting and/or disrupting the genes, i.e., by so-called gene "knockout". Alternatively,
genetic footprinting can be used (Smith et al., 1995, Proc. Natl. Acad. Sci. USA 92:5479-6433;
Published International Application WO 94/26933; U.S. Patent No. 5,612,180). Still
other methods for assessing essentiality include the ability to isolate conditional lethal
mutations in the specific gene (e.g., temperature-sensitive mutations). Other useful targets
for therapeutic drugs, which include polypeptides that are not essential for growth or
viability per se but lead to loss of viability of the cell, can be used to target therapeutic
agents to cells.

STRAIN-SPECIFIC SEQUENCES:

Because of the evolutionary relationship between different M. catarrhalis strains, it is
believed that the presently disclosed M. catarrhalis sequences are useful for identifying,
and/or discriminating between, previously known and new M. catarrhalis strains. It is
believed that other M. catarrhalis strains will exhibit at least about 70% sequence homology
with the presently disclosed sequence. Systematic and routine analyses of DNA sequences
derived from samples containing M. catarrhalis strains, and comparison with the present
sequence, allow for the identification of sequences that can be used to discriminate between
strains, as well as those that are common to all M. catarrhalis strains. In one embodiment,
the invention provides nucleic acids, including probes, and peptide and polypeptide
sequences that discriminate between different strains of M. catarrhalis. Strain-specific
components can also be identified functionally by their ability to elicit or react with
antibodies that selectively recognize one or more M. catarrhalis strains.

In another embodiment, the invention provides nucleic acids, including probes, and
peptide and polypeptide sequences that are common to all M. catarrhalis strains but are not
found in other bacterial species.

M. CATARRHALIS POLYPEPTIDES

This invention encompasses isolated M. catarrhalis polypeptides encoded by the
disclosed M. catarrhalis genomic sequences, including the polypeptides of the invention
contained in the Sequence Listing. Polypeptides of the invention are preferably at least
about 5 amino acid residues in length. Using the DNA sequence information provided
herein, the amino acid sequences of the polypeptides encompassed by the invention can be 
deduced using methods well-known in the art. It will be understood that the sequence of an 
entire nucleic acid encoding an M. catarrhalis polypeptide can be isolated and identified 
based on an ORF that encodes only a fragment of the cognate protein-coding region. This 
can be achieved, for example, by using the isolated nucleic acid encoding the ORF, or 
fragments thereof, to prime a polymerase chain reaction with genomic M. catarrhalis DNA 
as template; this is followed by sequencing the amplified product. 

The polypeptides of the present invention, including function-conservative variants 
of the disclosed ORFs, may be isolated from wild-type or mutant M. catarrhalis cells, or 
from heterologous organisms or cells (including, but not limited to, bacteria, fungi, insect, 
plant, and mammalian cells), including M. catarrhalis, into which an M. catarrhalis-derived
protein-coding sequence has been introduced and expressed. Furthermore, the polypeptides 
may be part of recombinant fusion proteins. 

M. catarrhalis polypeptides of the invention can be chemically synthesized using
commercially available automated procedures such as those referenced herein, including, without
limitation, exclusive solid phase synthesis, partial solid phase methods, fragment
condensation or classical solution synthesis. The polypeptides are preferably prepared by 
solid phase peptide synthesis as described by Merrifield, 1963, J. Am. Chem. Soc. 85:2149. 
The synthesis is carried out with amino acids that are protected at the alpha-amino terminus. 
Trifunctional amino acids with labile side-chains are also protected with suitable groups to 
prevent undesired chemical reactions from occurring during the assembly of the 
polypeptides. The alpha-amino protecting group is selectively removed to allow subsequent 
reaction to take place at the amino-terminus. The conditions for the removal of the alpha- 
amino protecting group do not remove the side-chain protecting groups. 

Methods for polypeptide purification are well-known in the art, including, without 
limitation, preparative disc-gel electrophoresis, isoelectric focusing, HPLC, reversed-phase 
HPLC, gel filtration, ion exchange and partition chromatography, and countercurrent 
distribution. For some purposes, it is preferable to produce the polypeptide in a recombinant
system in which the M. catarrhalis protein contains an additional sequence tag that
facilitates purification, such as, but not limited to, a polyhistidine sequence. The polypeptide
can then be purified from a crude lysate of the host cell by chromatography on an appropriate
solid-phase matrix. Alternatively, antibodies produced against an M. catarrhalis protein or
against peptides derived therefrom can be used as purification reagents. Other purification
methods are possible.

The present invention also encompasses derivatives and homologues of M.
catarrhalis-encoded polypeptides. For some purposes, nucleic acid sequences encoding the
peptides may be altered by substitutions, additions, or deletions that provide for functionally
equivalent molecules, i.e., function-conservative variants. For example, one or more amino
acid residues within the sequence can be substituted by another amino acid of similar
properties, such as, for example, positively charged amino acids (arginine, lysine, and
histidine); negatively charged amino acids (aspartate and glutamate); polar neutral amino
acids; and non-polar amino acids.
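
A minimal sketch of how a substitution might be checked against the four property classes
named above is given below. The membership of the positively and negatively charged classes
follows the text; the assignment of the remaining residues to the polar neutral and non-polar
classes is a common textbook grouping assumed here for illustration, and other groupings are
possible.

```python
# Property classes; the last two groupings are assumptions of this sketch.
CLASSES = {
    "positive": set("RKH"),        # arginine, lysine, histidine
    "negative": set("DE"),         # aspartate, glutamate
    "polar_neutral": set("STNQCY"),
    "nonpolar": set("AVLIMFWPG"),
}


def residue_class(aa: str) -> str:
    """Return the name of the property class containing the one-letter residue code."""
    for name, members in CLASSES.items():
        if aa.upper() in members:
            return name
    raise ValueError(f"unrecognized residue: {aa!r}")


def is_conservative(original: str, substitute: str) -> bool:
    """True when both residues fall in the same property class."""
    return residue_class(original) == residue_class(substitute)


print(is_conservative("K", "R"))  # both positively charged
print(is_conservative("D", "K"))  # opposite charges
```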

The isolated polypeptides may be modified by, for example, phosphorylation,
sulfation, acylation, or other protein modifications. They may also be modified with a label
capable of providing a detectable signal, either directly or indirectly, including, but not 
limited to, radioisotopes and fluorescent compounds. 

To identify M. catarrhalis-derived polypeptides for use in the present invention,
essentially the complete genomic sequence of a virulent, methicillin-resistant isolate of M.
catarrhalis was analyzed. While, in very rare instances, a nucleic acid sequencing
error may be revealed, resolving a rare sequencing error is well within the art, and such an 
occurrence will not prevent one skilled in the art from practicing the invention. 

Also encompassed are any M. catarrhalis polypeptide sequences that are contained
within the open reading frames (ORFs), including complete protein-coding sequences, of
which any of SEQ ID NO: 1 - SEQ ID NO: 1920 forms a part. Table 2, which is appended
herewith and which forms part of the present specification, provides a putative identification
of the particular function of a polypeptide which is encoded by each ORF, based on the
homology match (determined by the BLAST algorithm) of the predicted polypeptide with
known proteins encoded by ORFs in other organisms. As a result, one skilled in the art can
use the polypeptides of the present invention for commercial and industrial purposes
consistent with the type of putative identification of the polypeptide.

The present invention provides a library of M. catarrhalis-derived polypeptide
sequences, and a corresponding library of nucleic acid sequences encoding the polypeptides,
wherein the polypeptides themselves, or polypeptides contained within ORFs of which they
form a part, comprise sequences that are contemplated for use as components of vaccines.
Non-limiting examples of such sequences are listed by SEQ ID NO in Table 2, which is
appended herewith and which forms part of the present specification.

The present invention also provides a library of M. catarrhalis-derived polypeptide
sequences, and a corresponding library of nucleic acid sequences encoding the polypeptides,
wherein the polypeptides themselves, or polypeptides contained within ORFs of which they
form a part, comprise sequences lacking homology to any known prokaryotic or eukaryotic
sequences. Such libraries provide probes, primers, and markers which can be used to
diagnose M. catarrhalis infection, including use as markers in epidemiological studies.
Non-limiting examples of such sequences are listed by SEQ ID NO in Table 2, which is
appended hereto and part hereof.

The present invention also provides a library of M. catarrhalis-derived polypeptide
sequences, and a corresponding library of nucleic acid sequences encoding the polypeptides,
wherein the polypeptides themselves, or polypeptides contained within ORFs of which they
form a part, comprise targets for therapeutic drugs.

SPECIFIC EXAMPLE: DETERMINATION OF MORAXELLA PROTEIN ANTIGENS 
FOR ANTIBODY AND VACCINE DEVELOPMENT 

The selection of Moraxella protein antigens for vaccine development can be derived
from the nucleic acids encoding M. catarrhalis polypeptides. First, the ORFs can be
analyzed for homology to other known exported or membrane proteins and analyzed using
the discriminant analysis described by Klein et al. (Klein, P., Kanehisa, M., and DeLisi, C.
(1985) Biochimica et Biophysica Acta 815, 468-476) for predicting exported and membrane
proteins.

Homology searches can be performed using the BLAST algorithm contained in the
Wisconsin Sequence Analysis Package (Genetics Computer Group, University Research
Park, 575 Science Drive, Madison, WI 53711) to compare each predicted ORF amino acid
sequence with all sequences found in the current GenBank, SWISS-PROT and PIR
databases. BLAST searches for local alignments between the ORF and the databank
sequences and reports a probability score which indicates the probability of finding this
sequence by chance in the database. ORFs with significant homology (e.g., probabilities
lower than 1 x 10^-6 that the homology is only due to random chance) to membrane or
exported proteins represent protein antigens for vaccine development. Possible functions
can be provided to M. catarrhalis genes based on sequence homology to genes cloned in
other organisms.
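
The selection rule described above (a chance probability below 1 x 10^-6 together with a match
to a membrane or exported protein) could be applied to a list of search results roughly as
follows. The record layout, the ORF labels, the hit descriptions, and the keyword list
(including "secreted") are hypothetical and are introduced only for this sketch; no particular
BLAST output format is implied.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    orf_id: str         # hypothetical ORF label
    description: str    # database description of the matched protein
    probability: float  # reported probability that the match arose by chance


def candidate_antigens(hits: Iterable[Hit], cutoff: float = 1e-6,
                       keywords=("membrane", "exported", "secreted")) -> List[str]:
    """Return ORF labels with a significant match to a membrane or exported protein."""
    selected: List[str] = []
    for hit in hits:
        text = hit.description.lower()
        significant = hit.probability < cutoff
        relevant = any(keyword in text for keyword in keywords)
        if significant and relevant and hit.orf_id not in selected:
            selected.append(hit.orf_id)
    return selected


# Toy records for illustration only.
hits = [
    Hit("ORF_A", "outer membrane protein", 3e-42),
    Hit("ORF_B", "DNA polymerase III subunit", 1e-80),
    Hit("ORF_C", "exported lipoprotein", 2e-3),
    Hit("ORF_D", "secreted protease", 5e-9),
]
print(candidate_antigens(hits))
```

In this toy input, ORF_B is significant but not a membrane or exported protein, and ORF_C
matches an exported protein but not significantly, so neither is retained.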

Discriminant analysis (Klein, et al. supra) can be used to examine the ORF amino 
acid sequences. This algorithm uses the intrinsic information contained in the ORF amino 
acid sequence and compares it to information derived from the properties of known 
membrane and exported proteins. This comparison predicts which proteins will be exported,
membrane associated or cytoplasmic. ORF amino acid sequences identified as exported or 
membrane associated by this algorithm are likely protein antigens for vaccine development. 

PRODUCTION OF FRAGMENTS AND ANALOGS OF M. CATARRHALIS NUCLEIC
ACIDS AND POLYPEPTIDES

Based on the discovery of the M. catarrhalis gene products of the invention provided 
in the Sequence Listing, one skilled in the art can alter the disclosed structure of M. 
catarrhalis genes, e.g., by producing fragments or analogs, and test the newly produced 
structures for activity. Examples of techniques known to those skilled in the relevant art
which allow the production and testing of fragments and analogs are discussed below.
These, or analogous methods, can be used to make and screen libraries of polypeptides, e.g.,
libraries of random peptides or libraries of fragments or analogs of cellular proteins for the
ability to bind M. catarrhalis polypeptides. Such screens are useful for the identification of
inhibitors of M. catarrhalis.

GENERATION OF FRAGMENTS

Fragments of a protein can be produced in several ways, e.g., recombinantly, by 
proteolytic digestion, or by chemical synthesis. Internal or terminal fragments of a 
polypeptide can be generated by removing one or more nucleotides from one end (for a
terminal fragment) or both ends (for an internal fragment) of a nucleic acid which encodes
the polypeptide. Expression of the mutagenized DNA produces polypeptide fragments.
Digestion with "end-nibbling" endonucleases can thus generate DNAs which encode an array
of fragments. DNAs which encode fragments of a protein can also be generated by random
shearing, restriction digestion or a combination of the above-discussed methods.

Fragments can also be chemically synthesized using techniques known in the art such
as conventional Merrifield solid phase Fmoc or t-Boc chemistry. For example, peptides of
the present invention may be arbitrarily divided into fragments of desired length with no
overlap of the fragments, or divided into overlapping fragments of a desired length.
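
The division of a peptide into non-overlapping or overlapping fragments of a desired length
can be sketched as follows. The function name, its parameters, and the placeholder peptide
are illustrative assumptions; a final fragment shorter than the desired length is kept rather
than discarded, which is one possible design choice.

```python
def divide_into_fragments(peptide: str, length: int, overlap: int = 0) -> list:
    """Split a peptide into fragments of `length` residues sharing `overlap` residues."""
    if length <= 0 or not 0 <= overlap < length:
        raise ValueError("require length > 0 and 0 <= overlap < length")
    if not peptide:
        return []
    step = length - overlap
    # Stop once the remaining residues are already covered by the previous fragment.
    last_start = max(len(peptide) - overlap, 1)
    return [peptide[i:i + length] for i in range(0, last_start, step)]


peptide = "ABCDEFGHIJ"  # placeholder sequence, not a real polypeptide
print(divide_into_fragments(peptide, 4))             # no overlap
print(divide_into_fragments(peptide, 4, overlap=2))  # overlapping by two residues
```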


ALTERATION OF NUCLEIC ACIDS AND POLYPEPTIDES: RANDOM METHODS 

Amino acid sequence variants of a protein can be prepared by random mutagenesis of 
DNA which encodes a protein or a particular domain or region of a protein. Useful methods 
include PCR mutagenesis and saturation mutagenesis. A library of random amino acid
sequence variants can also be generated by the synthesis of a set of degenerate
oligonucleotide sequences. (Methods for screening proteins in a library of variants are
described elsewhere herein.)

PCR MUTAGENESIS 

In PCR mutagenesis, reduced Taq polymerase fidelity is used to introduce random
mutations into a cloned fragment of DNA (Leung et al., 1989, Technique 1:11-15). The
DNA region to be mutagenized is amplified using the polymerase chain reaction (PCR)
under conditions that reduce the fidelity of DNA synthesis by Taq DNA polymerase, e.g., by
using a dGTP/dATP ratio of five and adding Mn2+ to the PCR reaction. The pool of
amplified DNA fragments is inserted into appropriate cloning vectors to provide random
mutant libraries.

SATURATION MUTAGENESIS

Saturation mutagenesis allows for the rapid introduction of a large number of single
base substitutions into cloned DNA fragments (Mayers et al., 1985, Science 229:242). This
technique includes generation of mutations, e.g., by chemical treatment or irradiation of
single-stranded DNA in vitro, and synthesis of a complementary DNA strand. The mutation
frequency can be modulated by modulating the severity of the treatment, and essentially all
possible base substitutions can be obtained. Because this procedure does not involve a
genetic selection for mutant fragments, both neutral substitutions and those that alter
function are obtained. The distribution of point mutations is not biased toward conserved
sequence elements.
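
To illustrate what "essentially all possible base substitutions" means for a short fragment,
the sketch below enumerates every single-base substitution of a DNA string in silico. It
models only the resulting sequence space (three variants per position), not the chemical or
enzymatic procedure; the function name and the toy fragment are assumptions of the example.

```python
def single_base_substitutions(dna: str) -> list:
    """Return every sequence differing from `dna` by exactly one base substitution."""
    dna = dna.upper()
    variants = []
    for i, reference_base in enumerate(dna):
        for alternative in "ACGT":
            if alternative != reference_base:
                variants.append(dna[:i] + alternative + dna[i + 1:])
    return variants


fragment = "ATGC"  # toy fragment for illustration
variants = single_base_substitutions(fragment)
print(len(variants), "variants for a fragment of", len(fragment), "bases")
```

For a fragment of n unambiguous bases the list contains 3n sequences.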

DEGENERATE OLIGONUCLEOTIDES

A library of homologs can also be generated from a set of degenerate oligonucleotide
sequences. Chemical synthesis of a degenerate sequence can be carried out in an automatic
DNA synthesizer, and the synthetic genes then ligated into an appropriate expression vector.
The synthesis of degenerate oligonucleotides is known in the art (see, for example, Narang,
SA (1983) Tetrahedron 39:3; Itakura et al. (1981) Recombinant DNA, Proc 3rd Cleveland
Sympos. Macromolecules, ed. AG Walton, Amsterdam: Elsevier pp273-289; Itakura et al.
(1984) Annu. Rev. Biochem. 53:323; Itakura et al. (1984) Science 198:1056; Ike et al. (1983)
Nucleic Acid Res. 11:477). Such techniques have been employed in the directed evolution of
other proteins (see, for example, Scott et al. (1990) Science 249:386-390; Roberts et al.
(1992) PNAS 89:2429-2433; Devlin et al. (1990) Science 249:404-406; Cwirla et al. (1990)
PNAS 87:6378-6382; as well as U.S. Patents Nos. 5,223,409, 5,198,346, and 5,096,815).
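
As a hedged illustration of what a degenerate oligonucleotide encodes, the sketch below expands
a sequence written with standard IUPAC ambiguity codes into the individual sequences it
represents and reports the size of the resulting set. The function names and the example
oligonucleotides are assumptions made for this sketch.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def library_size(oligo: str) -> int:
    """Number of distinct sequences represented by a degenerate oligonucleotide."""
    size = 1
    for base in oligo.upper():
        size *= len(IUPAC[base])
    return size


def expand(oligo: str) -> list:
    """List every individual sequence represented by a degenerate oligonucleotide."""
    choices = (IUPAC[base] for base in oligo.upper())
    return ["".join(combination) for combination in product(*choices)]


print(expand("AR"))
print(library_size("NNK"))
```

Checking the size before expanding is advisable, because the number of sequences grows
multiplicatively with each degenerate position.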



ALTERATION OF NUCLEIC ACIDS AND POLYPEPTIDES: METHODS FOR
DIRECTED MUTAGENESIS

Non-random, or directed, mutagenesis techniques can be used to provide specific
sequences or mutations in specific regions. These techniques can be used to create variants 
which include, e.g., deletions, insertions, or substitutions, of residues of the known amino 
acid sequence of a protein. The sites for mutation can be modified individually or in series, 
e.g., by (1) substituting first with conserved amino acids and then with more radical choices 
depending upon results achieved, (2) deleting the target residue, or (3) inserting residues of 
the same or a different class adjacent to the located site, or combinations of options 1-3. 



ALANINE SCANNING MUTAGENESIS 

Alanine scanning mutagenesis is a useful method for identification of certain residues
or regions of the desired protein that are preferred locations or domains for mutagenesis
(Cunningham and Wells, Science 244:1081-1085, 1989). In alanine scanning, a residue or
group of target residues is identified (e.g., charged residues such as Arg, Asp, His, Lys, and
Glu) and replaced by a neutral or negatively charged amino acid (most preferably alanine or
polyalanine). Replacement of an amino acid can affect the interaction of the amino acids
with the surrounding aqueous environment in or outside the cell. Those domains
demonstrating functional sensitivity to the substitutions are then refined by introducing
further or other variants at or for the sites of substitution. Thus, while the site for
introducing an amino acid sequence variation is predetermined, the nature of the mutation
per se need not be predetermined. For example, to optimize the performance of a mutation
at a given site, alanine scanning or random mutagenesis may be conducted at the target
codon or region and the expressed desired protein subunit variants are screened for the
optimal combination of desired activity.
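
The design step of an alanine scan, in which each charged target residue is replaced in turn by
alanine, can be sketched as below. The default target set follows the charged residues named
above (Arg, Asp, His, Lys, and Glu); the function name, the variant labels, and the toy protein
are assumptions of the example, and the sketch covers only single substitutions, not
polyalanine replacement.

```python
def alanine_scan(protein: str, targets: str = "RDHKE") -> dict:
    """Map a label such as 'K2A' to the protein carrying that single alanine substitution."""
    variants = {}
    for position, residue in enumerate(protein.upper(), start=1):
        if residue in targets:
            label = f"{residue}{position}A"
            variants[label] = protein[:position - 1] + "A" + protein[position:]
    return variants


# Toy protein for illustration only.
for label, sequence in alanine_scan("MKDL").items():
    print(label, sequence)
```

Each variant would then be expressed and screened, and positions showing functional
sensitivity could be examined further with other substitutions.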

OLIGONUCLEOTIDE-MEDIATED MUTAGENESIS 

Oligonucleotide-mediated mutagenesis is a useful method for preparing substitution,
deletion, and insertion variants of DNA; see, e.g., Adelman et al. (DNA 2:183, 1983).
Briefly, the desired DNA is altered by hybridizing an oligonucleotide encoding a mutation to
a DNA template, where the template is the single-stranded form of a plasmid or
bacteriophage containing the unaltered or native DNA sequence of the desired protein. After
hybridization, a DNA polymerase is used to synthesize an entire second complementary
strand of the template that will thus incorporate the oligonucleotide primer, and will code for
the selected alteration in the desired protein DNA. Generally, oligonucleotides of at least
about 25 nucleotides in length are used. An optimal oligonucleotide will have 12 to 15
nucleotides that are completely complementary to the template on either side of the
nucleotide(s) coding for the mutation. This ensures that the oligonucleotide will hybridize
properly to the single-stranded DNA template molecule. The oligonucleotides are readily
synthesized using techniques known in the art such as that described by Crea et al. (Proc.
Natl. Acad. Sci. USA, 75:5765 [1978]).
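
The layout of such an oligonucleotide (the mutated nucleotides flanked on either side by 12 to
15 perfectly matched nucleotides) can be sketched as follows. The function names, the
zero-based coordinates, and the toy sequence are assumptions of this example. The sketch
returns the oligonucleotide in the same sense as the sequence supplied; a reverse-complement
helper is included because the strand actually synthesized depends on which strand the
single-stranded template carries.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def mutagenic_oligo(sequence: str, start: int, end: int,
                    replacement: str, flank: int = 15) -> str:
    """Replace sequence[start:end] with `replacement`, keeping `flank` matched bases per side."""
    if start < flank or len(sequence) - end < flank:
        raise ValueError("not enough sequence on one side for the requested flank")
    left = sequence[start - flank:start]
    right = sequence[end:end + flank]
    return left + replacement + right


# Toy coding sequence; the codon at positions 18-20 (zero-based) is changed to GCT.
sequence = "ATGACCGGTAAGCTTCTGGAAGATCCGTTGAACGCTAGCTAA"
oligo = mutagenic_oligo(sequence, 18, 21, "GCT", flank=12)
print(oligo, len(oligo))
print(reverse_complement(oligo))
```

With 12 matched nucleotides on each side of a three-nucleotide change, the oligonucleotide is
27 nucleotides long, consistent with the minimum length of about 25 stated above.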

CASSETTE MUTAGENESIS

Another method for preparing variants, cassette mutagenesis, is based on the 
technique described by Wells et al. (Gene, 34:315 [1985]). The starting material is a plasmid
(or other vector) which includes the protein subunit DNA to be mutated. The codon(s) in the 
protein subunit DNA to be mutated are identified. There must be a unique restriction 
endonuclease site on each side of the identified mutation site(s). If no such restriction sites 
exist, they may be generated using the above-described oligonucleotide-mediated 
mutagenesis method to introduce them at appropriate locations in the desired protein subunit 
DNA. After the restriction sites have been introduced into the plasmid, the plasmid is cut at 
these sites to linearize it. A double-stranded oligonucleotide encoding the sequence of the 
DNA between the restriction sites but containing the desired mutation(s) is synthesized using 
standard procedures. The two strands are synthesized separately and then hybridized 
together using standard techniques. This double-stranded oligonucleotide is referred to as
the cassette. This cassette is designed to have 3' and 5' ends that are compatible with the
ends of the linearized plasmid, such that it can be directly ligated to the plasmid. This
plasmid now contains the mutated desired protein subunit DNA sequence. 

COMBINATORIAL MUTAGENESIS

Combinatorial mutagenesis can also be used to generate mutants (Ladner et al., WO 
88/06630). In this method, the amino acid sequences for a group of homologs or other 
related proteins are aligned, preferably to promote the highest homology possible. All of the 
amino acids which appear at a given position of the aligned sequences can be selected to
create a degenerate set of combinatorial sequences. The variegated library of variants is
generated by combinatorial mutagenesis at the nucleic acid level, and is encoded by a
variegated gene library. For example, a mixture of synthetic oligonucleotides can be
enzymatically ligated into gene sequences such that the degenerate set of potential sequences
is expressible as individual peptides, or alternatively, as a set of larger fusion proteins
containing the set of degenerate sequences.
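
The construction of a degenerate set from aligned homologs can be illustrated at the amino
acid level as follows: the residues observed at each aligned position are collected, and every
combination of one residue per position is a member of the set. The function names and the
three toy sequences are assumptions of this sketch, which requires gap-free sequences of equal
length and does not model the nucleic acid level encoding.

```python
from itertools import product
from math import prod


def residues_per_position(aligned: list) -> list:
    """Sorted set of residues observed at each column of equal-length aligned sequences."""
    if len({len(sequence) for sequence in aligned}) > 1:
        raise ValueError("aligned sequences must have equal length")
    return [sorted(set(column)) for column in zip(*aligned)]


def combinatorial_size(choices: list) -> int:
    """Number of sequences in the degenerate set."""
    return prod(len(options) for options in choices)


def combinatorial_library(choices: list) -> list:
    """Every sequence obtainable by picking one observed residue per position."""
    return ["".join(combination) for combination in product(*choices)]


aligned = ["MKL", "MRL", "MKI"]  # toy homologs for illustration
choices = residues_per_position(aligned)
print(choices)
print(combinatorial_size(choices), combinatorial_library(choices))
```

The set can contain combinations, such as MRI in this toy case, that are absent from every
input sequence.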

OTHER MODIFICATIONS OF M. CATARRHALIS NUCLEIC ACIDS AND
POLYPEPTIDES 

It is possible to modify the structure of an M. catarrhalis polypeptide for such
purposes as increasing solubility or enhancing stability (e.g., shelf life ex vivo and resistance to
proteolytic degradation in vivo). A modified M. catarrhalis protein or peptide can be 
produced in which the amino acid sequence has been altered, such as by amino acid 
substitution, deletion, or addition as described herein. 

An M. catarrhalis peptide can also be modified by substitution of cysteine residues,
preferably with alanine, serine, threonine, leucine or glutamic acid residues, to minimize
dimerization via disulfide linkages. In addition, amino acid side chains of fragments of the 
protein of the invention can be chemically modified. Another modification is cyclization of 
the peptide. 

In order to enhance stability and/or reactivity, an M. catarrhalis polypeptide can be
modified to incorporate one or more polymorphisms in the amino acid sequence of the
protein resulting from any natural allelic variation. Additionally, D-amino acids, non-natural
amino acids, or non-amino acid analogs can be substituted or added to produce a modified
protein within the scope of this invention. Furthermore, an M. catarrhalis polypeptide can
be modified using polyethylene glycol (PEG) according to the method of A. Sehon and
co-workers (Wie et al., supra) to produce a protein conjugated with PEG. In addition, PEG can
be added during chemical synthesis of the protein. Other modifications of M. catarrhalis
proteins include reduction/alkylation (Tarr, Methods of Protein Microcharacterization, J. E.
Silver ed., Humana Press, Clifton NJ 155-194 (1986)); acylation (Tarr, supra); chemical
coupling to an appropriate carrier (Mishell and Shiigi, eds, Selected Methods in Cellular
Immunology, WH Freeman, San Francisco, CA (1980); U.S. Patent 4,939,239); or mild
formalin treatment (Marsh, (1971) Int. Arch. of Allergy and Appl. Immunol., 41:199-215).

To facilitate purification and potentially increase solubility of an M. catarrhalis
protein or peptide, it is possible to add an amino acid fusion moiety to the peptide backbone.
For example, hexa-histidine can be added to the protein for purification by immobilized
metal ion affinity chromatography (Hochuli, E. et al., (1988) Bio/Technology, 6:1321-1325).
In addition, to facilitate isolation of peptides free of irrelevant sequences, specific
endoprotease cleavage sites can be introduced between the sequences of the fusion moiety
and the peptide.

To potentially aid proper antigen processing of epitopes within an M. catarrhalis
polypeptide, canonical protease-sensitive sites can be engineered between regions, each
comprising at least one epitope, via recombinant or synthetic methods. For example, charged
amino acid pairs, such as KK or RR, can be introduced between regions within a protein or
fragment during recombinant construction thereof. The resulting peptide can be rendered
sensitive to cleavage by cathepsin and/or other trypsin-like enzymes which would generate
portions of the protein containing one or more epitopes. In addition, such charged amino
acid residues can result in an increase in the solubility of the peptide.

PRIMARY METHODS FOR SCREENING POLYPEPTIDES AND ANALOGS

Various techniques are known in the art for screening generated mutant gene
products. Techniques for screening large gene libraries often include cloning the gene
library into replicable expression vectors, transforming appropriate cells with the resulting
library of vectors, and expressing the genes under conditions in which detection of a desired
activity, e.g., in this case, binding to M. catarrhalis polypeptide or an interacting protein,
facilitates relatively easy isolation of the vector encoding the gene whose product was
detected. Each of the techniques described below is amenable to high-throughput analysis
for screening large numbers of sequences created, e.g., by random mutagenesis techniques.

TWO HYBRID SYSTEMS

Two hybrid assays such as the system described below (as with the other screening
methods described herein) can be used to identify polypeptides, e.g., fragments or analogs of
a naturally-occurring M. catarrhalis polypeptide, e.g., of cellular proteins, or of randomly
generated polypeptides which bind to an M. catarrhalis protein. (The M. catarrhalis
domain is used as the bait protein and the library of variants is expressed as prey fusion
proteins.) In an analogous fashion, a two hybrid assay (as with the other screening methods
described herein) can be used to find polypeptides which bind an M. catarrhalis
polypeptide.

DISPLAY LIBRARIES 

In one approach to screening assays, the Moraxella peptides are displayed on the
surface of a cell or viral particle, and the ability of particular cells or viral particles to bind an
appropriate receptor protein via the displayed product is detected in a "panning assay". For
example, the gene library can be cloned into the gene for a surface membrane protein of a
bacterial cell, and the resulting fusion protein detected by panning (Ladner et al., WO
88/06630; Fuchs et al. (1991) Bio/Technology 9:1370-1371; and Goward et al. (1992) TIBS
18:136-140). In a similar fashion, a detectably labeled ligand can be used to score for
potentially functional peptide homologs. Fluorescently labeled ligands, e.g., receptors, can
be used to detect homologs which retain ligand-binding activity. The use of fluorescently
labeled ligands allows cells to be visually inspected and separated under a fluorescence
microscope, or, where the morphology of the cell permits, to be separated by a
fluorescence-activated cell sorter.

A gene library can be expressed as a fusion protein on the surface of a viral particle. 
For instance, in the filamentous phage system, foreign peptide sequences can be expressed 
on the surface of infectious phage, thereby conferring two significant benefits. First, since
these phage can be applied to affinity matrices at concentrations well over 10^13 phage per
milliliter, a large number of phage can be screened at one time. Second, since each
infectious phage displays a gene product on its surface, if a particular phage is recovered
from an affinity matrix in low yield, the phage can be amplified by another round of
infection. The group of almost identical E. coli filamentous phages, M13, fd, and f1, are
most often used in phage display libraries. Either of the phage gIII or gVIII coat proteins can
be used to generate fusion proteins without disrupting the ultimate packaging of the viral
particle. Foreign epitopes can be expressed at the NH2-terminal end of pIII and phage
bearing such epitopes recovered from a large excess of phage lacking this epitope (Ladner et
al. PCT publication WO 90/02909; Garrard et al., PCT publication WO 92/09690; Marks et
al. (1992) J. Biol. Chem. 267:16007-16010; Griffiths et al. (1993) EMBO J. 12:725-734;
Clackson et al. (1991) Nature 352:624-628; and Barbas et al. (1992) PNAS 89:4457-4461).

A common approach uses the maltose receptor of E. coli (the outer membrane
protein, LamB) as a peptide fusion partner (Charbit et al. (1986) EMBO 5, 3029-3037).
Oligonucleotides have been inserted into plasmids encoding the LamB gene to produce
peptides fused into one of the extracellular loops of the protein. These peptides are available
for binding to ligands, e.g., to antibodies, and can elicit an immune response when the cells
are administered to animals. Other cell surface proteins, e.g., OmpA (Schorr et al. (1991)
Vaccines 91, pp. 387-392), PhoE (Agterberg, et al. (1990) Gene 88, 37-45), and PAL (Fuchs
et al. (1991) Bio/Tech 9, 1369-1372), as well as large bacterial surface structures have served
as vehicles for peptide display. Peptides can be fused to pilin, a protein which polymerizes
to form the pilus, a conduit for interbacterial exchange of genetic information (Thiry et al.
(1989) Appl. Environ. Microbiol. 55, 984-993). Because of its role in interacting with other
cells, the pilus provides a useful support for the presentation of peptides to the extracellular
environment. Another large surface structure used for peptide display is the bacterial motive
organ, the flagellum. Fusion of peptides to the subunit protein flagellin offers a dense array
of many peptide copies on the host cells (Kuwajima et al. (1988) Bio/Tech. 6, 1080-1083).
Surface proteins of other bacterial species have also served as peptide fusion partners.
Examples include the Moraxella protein A and the outer membrane IgA protease of
Neisseria (Hansson et al. (1992) J. Bacteriol. 174, 4239-4245 and Klauser et al. (1990)
EMBO J. 9, 1991-1999).

In the filamentous phage systems and the LamB system described above, the physical
link between the peptide and its encoding DNA occurs by the containment of the DNA
within a particle (cell or phage) that carries the peptide on its surface. Capturing the peptide
captures the particle and the DNA within. An alternative scheme uses the DNA-binding
protein LacI to form a link between peptide and DNA (Cull et al. (1992) PNAS USA
89:1865-1869). This system uses a plasmid containing the LacI gene with an
oligonucleotide cloning site at its 3'-end. Under the controlled induction by arabinose, a
LacI-peptide fusion protein is produced. This fusion retains the natural ability of LacI to
bind to a short DNA sequence known as LacO operator (LacO). By installing two copies of
LacO on the expression plasmid, the LacI-peptide fusion binds tightly to the plasmid that
encoded it. Because the plasmids in each cell contain only a single oligonucleotide sequence
and each cell expresses only a single peptide sequence, the peptides become specifically and
stably associated with the DNA sequence that directed their synthesis. The cells of the
library are gently lysed and the peptide-DNA complexes are exposed to a matrix of
immobilized receptor to recover the complexes containing active peptides. The associated
plasmid DNA is then reintroduced into cells for amplification and DNA sequencing to
determine the identity of the peptide ligands. As a demonstration of the practical utility of
the method, a large random library of dodecapeptides was made and selected on a
monoclonal antibody raised against the opioid peptide dynorphin B. A cohort of peptides
was recovered, all related by a consensus sequence corresponding to a six-residue portion of
dynorphin B. (Cull et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:1865-1869)

This scheme, sometimes referred to as peptides-on-plasmids, differs in two important
ways from the phage display methods. First, the peptides are attached to the C-terminus of
the fusion protein, resulting in the display of the library members as peptides having free
carboxy termini. Both of the filamentous phage coat proteins, pIII and pVIII, are anchored to
the phage through their C-termini, and the guest peptides are placed into the
outward-extending N-terminal domains. In some designs, the phage-displayed peptides are presented
right at the amino terminus of the fusion protein. (Cwirla et al. (1990) Proc. Natl. Acad. Sci.
U.S.A. 87, 6378-6382) A second difference is the set of biological biases affecting the
population of peptides actually present in the libraries. The LacI fusion molecules are
confined to the cytoplasm of the host cells. The phage coat fusions are exposed briefly to the
cytoplasm during translation but are rapidly secreted through the inner membrane into the
periplasmic compartment, remaining anchored in the membrane by their C-terminal
hydrophobic domains, with the N-termini, containing the peptides, protruding into the
periplasm while awaiting assembly into phage particles. The peptides in the LacI and phage
libraries may differ significantly as a result of their exposure to different proteolytic
activities. The phage coat proteins require transport across the inner membrane and signal
peptidase processing as a prelude to incorporation into phage. Certain peptides exert a
deleterious effect on these processes and are underrepresented in the libraries (Gallop et al.
(1994) J. Med. Chem. 37(9):1233-1251). These particular biases are not a factor in the LacI
display system.

The number of small peptides available in recombinant random libraries is enormous.
Libraries of 10^7-10^9 independent clones are routinely prepared. Libraries as large as 10^11
recombinants have been created, but this size approaches the practical limit for clone
libraries. This limitation in library size occurs at the step of transforming the DNA
containing randomized segments into the host bacterial cells. To circumvent this limitation,
an in vitro system based on the display of nascent peptides in polysome complexes has
recently been developed. This display library method has the potential of producing libraries
3-6 orders of magnitude larger than the currently available phage/phagemid or plasmid
libraries. Furthermore, the construction of the libraries, expression of the peptides, and
screening are done in an entirely cell-free format.
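
The relationship between library size and the number of possible peptide sequences can be illustrated with a short calculation. The following is a minimal sketch only: it assumes that all 20 standard amino acids are equally likely at every position and that clones are sampled independently (neither assumption is stated in this specification, and real randomized codon schemes are biased), and the function names are hypothetical.

```python
import math

AMINO_ACIDS = 20  # assumption: the 20 standard amino acids, equally likely


def sequence_space(peptide_length: int) -> int:
    """Number of distinct peptides of the given length."""
    return AMINO_ACIDS ** peptide_length


def expected_coverage(library_size: float, peptide_length: int) -> float:
    """Expected fraction of all possible peptides present at least once.

    Uses the Poisson approximation 1 - exp(-N/S), where N is the number of
    independent clones and S is the size of the sequence space.
    """
    space = sequence_space(peptide_length)
    return -math.expm1(-library_size / space)


if __name__ == "__main__":
    # Library sizes mentioned in the text: 10^9, 10^11 (clone libraries)
    # and 10^12 (polysome display), applied here to decapeptides.
    for size in (1e9, 1e11, 1e12):
        print(f"{size:.0e} clones: {expected_coverage(size, 10):.4%} of decapeptides")
```

Under these assumptions, even the largest clone libraries sample only a small fraction of the possible decapeptides, which is consistent with the motivation given above for larger cell-free libraries.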

In one application of this method (Gallop et al. (1994) J. Med. Chem. 37(9):1233-1251),
a molecular DNA library encoding 10^12 decapeptides was constructed and the library
expressed in an E. coli S30 in vitro coupled transcription/translation system. Conditions
were chosen to stall the ribosomes on the mRNA, causing the accumulation of a substantial
proportion of the RNA in polysomes and yielding complexes containing nascent peptides
still linked to their encoding RNA. The polysomes are sufficiently robust to be affinity
purified on immobilized receptors in much the same way as the more conventional
recombinant peptide display libraries are screened. RNA from the bound complexes is
recovered, converted to cDNA, and amplified by PCR to produce a template for the next
round of synthesis and screening. The polysome display method can be coupled to the phage
display system. Following several rounds of screening, cDNA from the enriched pool of
polysomes was cloned into a phagemid vector. This vector serves as both a peptide
expression vector, displaying peptides fused to the coat proteins, and as a DNA sequencing
vector for peptide identification. By expressing the polysome-derived peptides on phage,
one can either continue the affinity selection procedure in this format or assay the peptides
on individual clones for binding activity in a phage ELISA, or for binding specificity in a
competition phage ELISA (Barret et al. (1992) Anal. Biochem. 204, 357-364). To identify the
sequences of the active peptides one sequences the DNA produced by the phagemid host.

SECONDARY SCREENING OF POLYPEPTIDES AND ANALOGS 

The high throughput assays described above can be followed by secondary screens
in order to identify further biological activities which will, e.g., allow one skilled in the art to
differentiate agonists from antagonists. The type of secondary screen used will depend on
the desired activity that needs to be tested. For example, an assay can be developed in which
the ability to inhibit an interaction between a protein of interest and its respective ligand can
be used to identify antagonists from a group of peptide fragments isolated through one of the
primary screens described above.

Therefore, methods for generating fragments and analogs and testing them for
activity are known in the art. Once the core sequence of interest is identified, it is routine for
one skilled in the art to obtain analogs and fragments.

PEPTIDE MIMETICS OF M. CATARRHALIS POLYPEPTIDES

The invention also provides for reduction of the protein binding domains of the
subject M. catarrhalis polypeptides to generate mimetics, e.g. peptide or non-peptide agents.
The peptide mimetics are able to disrupt binding of a polypeptide to its counter ligand, e.g.,
in the case of an M. catarrhalis polypeptide binding to a naturally occurring ligand. The
critical residues of a subject M. catarrhalis polypeptide which are involved in molecular
recognition of a polypeptide can be determined and used to generate M. catarrhalis-derived
peptidomimetics which competitively or noncompetitively inhibit binding of the M.
catarrhalis polypeptide with an interacting polypeptide (see, for example, European patent
applications EP-412,762A and EP-B31,080A).

For example, scanning mutagenesis can be used to map the amino acid residues of a
particular M. catarrhalis polypeptide involved in binding an interacting polypeptide, and
peptidomimetic compounds (e.g. diazepine or isoquinoline derivatives) can be generated
which mimic those residues in binding to an interacting polypeptide, and which therefore can
inhibit binding of an M. catarrhalis polypeptide to an interacting polypeptide and thereby
interfere with the function of the M. catarrhalis polypeptide. For instance, non-hydrolyzable
peptide analogs of such residues can be generated using benzodiazepine (e.g., see Freidinger
et al. in Peptides: Chemistry and Biology, G.R. Marshall ed., ESCOM Publisher: Leiden,
Netherlands, 1988), azepine (e.g., see Huffman et al. in Peptides: Chemistry and Biology,
G.R. Marshall ed., ESCOM Publisher: Leiden, Netherlands, 1988), substituted gamma lactam
rings (Garvey et al. in Peptides: Chemistry and Biology, G.R. Marshall ed., ESCOM
Publisher: Leiden, Netherlands, 1988), keto-methylene pseudopeptides (Ewenson et al.
(1986) J. Med. Chem. 29:295; and Ewenson et al. in Peptides: Structure and Function
(Proceedings of the 9th American Peptide Symposium) Pierce Chemical Co. Rockland, IL,
1985), β-turn dipeptide cores (Nagai et al. (1985) Tetrahedron Lett. 26:647; and Sato et al.
(1986) J. Chem. Soc. Perkin Trans. 1:1231), and β-aminoalcohols (Gordon et al. (1985)
Biochem. Biophys. Res. Commun. 126:419; and et al. (1986) Biochem. Biophys. Res. Commun.
134:71).

VACCINE FORMULATIONS FOR M. CATARRHALIS NUCLEIC ACIDS AND 
POLYPEPTIDES 

This invention also features vaccine compositions for protection against infection by
M. catarrhalis or for treatment of M. catarrhalis infection. In one embodiment, the vaccine
compositions contain one or more immunogenic components such as a surface protein from
M. catarrhalis, or portion thereof, and a pharmaceutically acceptable carrier. Nucleic acids
within the scope of the invention are exemplified by the nucleic acids of the invention
contained in the Sequence Listing which encode M. catarrhalis surface proteins. Any
nucleic acid encoding an immunogenic M. catarrhalis protein, or portion thereof, which is
capable of expression in a cell, can be used in the present invention. These vaccines have
therapeutic and prophylactic utilities.

One aspect of the invention provides a vaccine composition for protection against
infection by M. catarrhalis which contains at least one immunogenic fragment of an M.
catarrhalis protein and a pharmaceutically acceptable carrier. Preferred fragments include
peptides of at least about 10 amino acid residues in length, preferably about 10-20 amino
acid residues in length, and more preferably about 12-16 amino acid residues in length.

Immunogenic components of the invention can be obtained, for example, by
screening polypeptides recombinantly produced from the corresponding fragment of the
nucleic acid encoding the full-length M. catarrhalis protein. In addition, fragments can be
chemically synthesized using techniques known in the art such as conventional Merrifield
solid phase f-Moc or t-Boc chemistry.
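
As an illustration of how candidate fragments in the preferred length range might be enumerated from a full-length protein sequence before synthesis or screening, the following minimal sketch generates overlapping peptides with a sliding window. The function name, the step size, and the example sequence are hypothetical and are not taken from the Sequence Listing; only the fragment length range of about 12-16 residues comes from the text above.

```python
def overlapping_fragments(sequence: str, length: int = 15, step: int = 5) -> list[str]:
    """Return overlapping peptide fragments of a fixed length.

    `length` is the fragment size in residues; `step` is the offset between
    the start positions of consecutive fragments (an illustrative choice).
    A sequence shorter than `length` yields an empty list.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    last_start = len(sequence) - length
    return [sequence[i:i + length] for i in range(0, last_start + 1, step)]


if __name__ == "__main__":
    # Hypothetical 20-residue sequence used only to demonstrate the windowing.
    example = "ACDEFGHIKLMNPQRSTVWY"
    for peptide in overlapping_fragments(example, length=12, step=4):
        print(peptide)
```

A smaller step gives denser overlap between fragments, which reduces the chance of splitting an epitope across two peptides at the cost of synthesizing more peptides.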

In one embodiment, immunogenic components are identified by the ability of the
peptide to stimulate T cells. Peptides which stimulate T cells, as determined by, for
example, T cell proliferation or cytokine secretion are defined herein as comprising at least
one T cell epitope. T cell epitopes are believed to be involved in initiation and perpetuation
of the immune response to the protein allergen which is responsible for the clinical
symptoms of allergy. These T cell epitopes are thought to trigger early events at the level of
the T helper cell by binding to an appropriate HLA molecule on the surface of an antigen
presenting cell, thereby stimulating the T cell subpopulation with the relevant T cell receptor
for the epitope. These events lead to T cell proliferation, lymphokine secretion, local
inflammatory reactions, recruitment of additional immune cells to the site of antigen/T cell
interaction, and activation of the B cell cascade, leading to the production of antibodies. A T
cell epitope is the basic element, or smallest unit of recognition by a T cell receptor, where
the epitope comprises amino acids essential to receptor recognition (e.g., approximately 6 or
7 amino acid residues). Amino acid sequences which mimic those of the T cell epitopes are
within the scope of this invention.

Screening immunogenic components can be accomplished using one or more of
several different assays. For example, in vitro, peptide T cell stimulatory activity is assayed
by contacting a peptide known or suspected of being immunogenic with an antigen
presenting cell which presents appropriate MHC molecules in a T cell culture. Presentation
of an immunogenic M. catarrhalis peptide in association with appropriate MHC molecules
to T cells in conjunction with the necessary co-stimulation has the effect of transmitting a
signal to the T cell that induces the production of increased levels of cytokines, particularly
of interleukin-2 and interleukin-4. The culture supernatant can be obtained and assayed for
interleukin-2 or other known cytokines. For example, any one of several conventional assays
for interleukin-2 can be employed, such as the assay described in Proc. Natl. Acad. Sci. USA,
86:1333 (1989), the pertinent portions of which are incorporated herein by reference. A kit
for an assay for the production of interferon is also available from Genzyme Corporation
(Cambridge, MA).

Alternatively, a common assay for T cell proliferation entails measuring tritiated
thymidine incorporation. The proliferation of T cells can be measured in vitro by
determining the amount of 3H-labeled thymidine incorporated into the replicating DNA of
cultured cells. Therefore, the rate of DNA synthesis and, in turn, the rate of cell division can
be quantified.
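
Thymidine incorporation data of this kind are commonly summarized as a stimulation index, i.e. the ratio of mean counts per minute (cpm) in peptide-stimulated cultures to mean cpm in unstimulated control cultures. The following is a minimal sketch of that calculation; the stimulation index is not defined in this specification, and the function name and the replicate values are hypothetical.

```python
from statistics import mean


def stimulation_index(stimulated_cpm: list[float], control_cpm: list[float]) -> float:
    """Ratio of mean cpm in stimulated wells to mean cpm in control wells."""
    control = mean(control_cpm)
    if control <= 0:
        raise ValueError("control cpm must have a positive mean")
    return mean(stimulated_cpm) / control


if __name__ == "__main__":
    # Hypothetical triplicate wells (cpm), for illustration only.
    peptide_wells = [3000, 3300, 2700]
    medium_wells = [1000, 1100, 900]
    print(f"stimulation index: {stimulation_index(peptide_wells, medium_wells):.1f}")
```

The index at which a peptide is scored as stimulatory is a design choice for the particular assay and is not fixed by this sketch.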

Vaccine compositions of the invention containing immunogenic components (e.g.,
M. catarrhalis polypeptide or fragment thereof or nucleic acid encoding an M. catarrhalis
polypeptide or fragment thereof) preferably include a pharmaceutically acceptable carrier.
The term "pharmaceutically acceptable carrier" refers to a carrier that does not cause an
allergic reaction or other untoward effect in patients to whom it is administered. Suitable
pharmaceutically acceptable carriers include, for example, one or more of water, saline,
phosphate buffered saline, dextrose, glycerol, ethanol and the like, as well as combinations
thereof. Pharmaceutically acceptable carriers may further comprise minor amounts of
auxiliary substances such as wetting or emulsifying agents, preservatives or buffers, which
enhance the shelf life or effectiveness of the antibody. For vaccines of the invention
containing M. catarrhalis polypeptides, the polypeptide is co-administered with a suitable
adjuvant.



It will be apparent to those of skill in the art that the therapeutically effective amount
of DNA or protein of this invention will depend, inter alia, upon the administration
schedule, the unit dose of antibody administered, whether the protein or DNA is
administered in combination with other therapeutic agents, the immune status and health of
the patient, and the therapeutic activity of the particular protein or DNA.

Vaccine compositions are conventionally administered parenterally, e.g., by injection,
either subcutaneously or intramuscularly. Methods for intramuscular immunization are
described by Wolff et al. (1990) Science 247:1465-1468 and by Sedegah et al. (1994)
Immunology 91:9866-9870. Other modes of administration include oral and pulmonary
formulations, suppositories, and transdermal applications. Oral immunization is preferred
over parenteral methods for inducing protection against infection by M. catarrhalis (Cain
et al. (1993) Vaccine 11:637-642). Oral formulations include such normally employed
excipients as, for example, pharmaceutical grades of mannitol, lactose, starch, magnesium
stearate, sodium saccharine, cellulose, magnesium carbonate, and the like.

The vaccine compositions of the invention can include an adjuvant, including, but
not limited to, aluminum hydroxide; N-acetyl-muramyl-L-threonyl-D-isoglutamine
(thr-MDP); N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as
nor-MDP); N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-
glycero-3-hydroxyphosphoryloxy)-ethylamine (CGP 19835A, referred to as MTP-PE); RIBI,
which contains three components from bacteria, monophosphoryl lipid A, trehalose
dimycolate, and cell wall skeleton (MPL + TDM + CWS), in a 2% squalene/Tween 80
emulsion; and cholera toxin. Others which may be used are non-toxic derivatives of cholera
toxin, including its B subunit, and/or conjugates or genetically engineered fusions of the M.
catarrhalis polypeptide with cholera toxin or its B subunit, procholeragenoid, fungal
polysaccharides, including schizophyllan, muramyl dipeptide, muramyl dipeptide
derivatives, phorbol esters, labile toxin of E. coli, non-M. catarrhalis bacterial lysates, block
polymers or saponins.

Other suitable delivery methods include biodegradable microcapsules or
immuno-stimulating complexes (ISCOMs), cochleates, or liposomes, genetically engineered
attenuated live vectors such as viruses or bacteria, and recombinant (chimeric) virus-like
particles, e.g., bluetongue. The amount of adjuvant employed will depend on the type of
adjuvant used. For example, when the mucosal adjuvant is cholera toxin, it is suitably used
in an amount of 5 mg to 50 mg, for example 10 mg to 35 mg. When used in the form of
microcapsules, the amount used will depend on the amount employed in the matrix of the
microcapsule to achieve the desired dosage. The determination of this amount is within the
skill of a person of ordinary skill in the art.

Carrier systems in humans may include enteric release capsules protecting the
antigen from the acidic environment of the stomach, and including M. catarrhalis
polypeptide in an insoluble form as fusion proteins. Suitable carriers for the vaccines of the
invention are enteric coated capsules and polylactide-glycolide microspheres. Suitable
diluents are 0.2 N NaHCO3 and/or saline.

Vaccines of the invention can be administered as a primary prophylactic agent in
adults or in children, as a secondary prevention, after successful eradication of M.
catarrhalis in an infected host, or as a therapeutic agent with the aim of inducing an immune
response in a susceptible host to prevent infection by M. catarrhalis. The vaccines of the
invention are administered in amounts readily determined by persons of ordinary skill in the
art. Thus, for adults a suitable dosage will be in the range of 10 mg to 10 g, preferably 10
mg to 100 mg. A suitable dosage for adults will also be in the range of 5 mg to 500 mg.
Similar dosage ranges will be applicable for children. Those skilled in the art will recognize
that the optimal dose may be more or less depending upon the patient's body weight, disease,
the route of administration, and other factors. Those skilled in the art will also recognize
that appropriate dosage levels can be obtained based on results with known oral vaccines
such as, for example, a vaccine based on an E. coli lysate (6 mg dose daily up to total of 540
mg) and with an enterotoxigenic E. coli purified antigen (4 doses of 1 mg) (Schulman et al.,
J. Urol. 150:917-921 (1993); Boedecker et al., American Gastroenterological Assoc.
999:A-222 (1993)). The number of doses will depend upon the disease, the formulation, and
efficacy data from clinical trials. Without intending any limitation as to the course of
treatment, the treatment can be administered over 3 to 8 doses for a primary immunization
schedule over 1 month (Boedeker, American Gastroenterological Assoc. 888:A-222 (1993)).

In a preferred embodiment, a vaccine composition of the invention can be based on a
killed whole E. coli preparation with an immunogenic fragment of an M. catarrhalis protein
of the invention expressed on its surface or it can be based on an E. coli lysate, wherein the
killed E. coli acts as a carrier or an adjuvant.

It will be apparent to those skilled in the art that some of the vaccine compositions of
the invention are useful only for preventing M. catarrhalis infection, some are useful only
for treating M. catarrhalis infection, and some are useful for both preventing and treating M.
catarrhalis infection. In a preferred embodiment, the vaccine composition of the invention
provides protection against M. catarrhalis infection by stimulating humoral and/or
cell-mediated immunity against M. catarrhalis. It should be understood that amelioration of any
of the symptoms of M. catarrhalis infection is a desirable clinical goal, including a lessening
of the dosage of medication used to treat M. catarrhalis-caused disease, or an increase in the
production of antibodies in the serum or mucus of patients.

ANTIBODIES REACTIVE WITH M. CATARRHALIS POLYPEPTIDES

The invention also includes antibodies specifically reactive with the subject M.
catarrhalis polypeptide. Anti-protein/anti-peptide antisera or monoclonal antibodies can be
made by standard protocols (See, for example, Antibodies: A Laboratory Manual ed. by
Harlow and Lane (Cold Spring Harbor Press: 1988)). A mammal such as a mouse, a hamster
or rabbit can be immunized with an immunogenic form of the peptide. Techniques for
conferring immunogenicity on a protein or peptide include conjugation to carriers or other
techniques well known in the art. An immunogenic portion of the subject M. catarrhalis
polypeptide can be administered in the presence of adjuvant. The progress of immunization
can be monitored by detection of antibody titers in plasma or serum. Standard ELISA or
other immunoassays can be used with the immunogen as antigen to assess the levels of
antibodies.

In a preferred embodiment, the subject antibodies are immunospecific for antigenic
determinants of the M. catarrhalis polypeptides of the invention, e.g. antigenic determinants
of a polypeptide of the invention contained in the Sequence Listing, or a closely related
human or non-human mammalian homolog (e.g., 90% homologous, more preferably at least
about 95% homologous). In yet a further preferred embodiment of the invention, the anti-M.
catarrhalis antibodies do not substantially cross react (i.e., react specifically) with a protein
which is, for example, less than 80 percent homologous to a sequence of the invention
contained in the Sequence Listing. By "not substantially cross react", it is meant that the
antibody has a binding affinity for a non-homologous protein which is less than 10 percent,
more preferably less than 5 percent, and even more preferably less than 1 percent, of the
binding affinity for a protein of the invention contained in the Sequence Listing. In a most
preferred embodiment, there is no cross-reactivity between bacterial and mammalian
antigens.
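
The cross-reactivity thresholds stated above can be expressed as a simple ratio test. The sketch below is illustrative only: it assumes that binding affinity is supplied as an association constant in which a larger value means tighter binding (the specification does not say how affinity is measured), and the function name and tier labels are hypothetical.

```python
def cross_reactivity_tier(target_affinity: float, off_target_affinity: float) -> str:
    """Classify off-target binding relative to binding of the target protein.

    Both affinities are assumed to be association constants in the same units.
    The 10, 5 and 1 percent cut-offs are the ones given in the text.
    """
    if target_affinity <= 0 or off_target_affinity < 0:
        raise ValueError("affinities must be positive (target) and non-negative")
    percent = 100.0 * off_target_affinity / target_affinity
    if percent < 1.0:
        return "less than 1 percent"
    if percent < 5.0:
        return "less than 5 percent"
    if percent < 10.0:
        return "less than 10 percent"
    return "substantially cross-reactive"


if __name__ == "__main__":
    # Hypothetical association constants (per molar), for illustration only.
    print(cross_reactivity_tier(1e9, 5e6))
    print(cross_reactivity_tier(1e9, 2e8))
```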

The term antibody as used herein is intended to include fragments thereof which are
also specifically reactive with M. catarrhalis polypeptides. Antibodies can be fragmented
using conventional techniques and the fragments screened for utility in the same manner as
described above for whole antibodies. For example, F(ab')2 fragments can be generated by
treating antibody with pepsin. The resulting F(ab')2 fragment can be treated to reduce
disulfide bridges to produce Fab' fragments. The antibody of the invention is further
intended to include bispecific and chimeric molecules having an anti-M. catarrhalis portion.

Both monoclonal and polyclonal antibodies (Ab) directed against M. catarrhalis
polypeptides or M. catarrhalis polypeptide variants, and antibody fragments such as Fab'
and F(ab')2, can be used to block the action of M. catarrhalis polypeptide and allow the
study of the role of a particular M. catarrhalis polypeptide of the invention in aberrant or
unwanted intracellular signaling, as well as the normal cellular function of M. catarrhalis,
for example by microinjection of anti-M. catarrhalis polypeptide antibodies of the present
invention.

Antibodies which specifically bind M. catarrhalis epitopes can also be used in
immunohistochemical staining of tissue samples in order to evaluate the abundance and
pattern of expression of M. catarrhalis antigens. Anti-M. catarrhalis polypeptide antibodies
can be used diagnostically in immuno-precipitation and immuno-blotting to detect and
evaluate M. catarrhalis levels in tissue or bodily fluid as part of a clinical testing procedure.
Likewise, the ability to monitor M. catarrhalis polypeptide levels in an individual can allow
determination of the efficacy of a given treatment regimen for an individual afflicted with
such a disorder. The level of an M. catarrhalis polypeptide can be measured in cells found
in bodily fluid, such as in urine samples or can be measured in tissue, such as produced by
gastric biopsy. Diagnostic assays using anti-M. catarrhalis antibodies can include, for
example, immunoassays designed to aid in early diagnosis of M. catarrhalis infections. The
present invention can also be used as a method of detecting antibodies contained in samples
from individuals infected by this bacterium using specific M. catarrhalis antigens.

Another application of anti-M. catarrhalis polypeptide antibodies of the invention is
in the immunological screening of cDNA libraries constructed in expression vectors such as
λgt11, λgt18-23, λZAP, and λORF8. Messenger libraries of this type, having coding
sequences inserted in the correct reading frame and orientation, can produce fusion proteins.
For instance, λgt11 will produce fusion proteins whose amino termini consist of
β-galactosidase amino acid sequences and whose carboxy termini consist of a foreign
polypeptide. Antigenic epitopes of a subject M. catarrhalis polypeptide can then be detected
with antibodies, as, for example, reacting nitrocellulose filters lifted from infected plates
with anti-M. catarrhalis polypeptide antibodies. Phage, scored by this assay, can then be
isolated from the infected plate. Thus, the presence of M. catarrhalis gene homologs can be
detected and cloned from other species, and alternate isoforms (including splicing variants)
can be detected and cloned.

KITS CONTAINING NUCLEIC ACIDS, POLYPEPTIDES OR ANTIBODIES OF THE 
INVENTION 

The nucleic acids, polypeptides and antibodies of the invention can be combined with
other reagents and articles to form kits. Kits for diagnostic purposes typically comprise the
nucleic acids, polypeptides or antibodies in vials or other suitable vessels. Kits typically
comprise other reagents for performing hybridization reactions, polymerase chain reactions
(PCR), or for reconstitution of lyophilized components, such as aqueous media, salts,
buffers, and the like. Kits may also comprise reagents for sample processing such as
detergents, chaotropic salts and the like. Kits may also comprise immobilization means such
as particles, supports, wells, dipsticks and the like. Kits may also comprise labeling means
such as dyes, developing reagents, radioisotopes, fluorescent agents, luminescent or
chemiluminescent agents, enzymes, intercalating agents and the like. With the nucleic acid
and amino acid sequence information provided herein, individuals skilled in the art can readily
assemble kits to serve their particular purpose. Kits further can include instructions for use.

BIO CHIP TECHNOLOGY 

The nucleic acid sequence of the present invention may be used to detect nucleic acid
sequence of M. catarrhalis or other species of Moraxella using bio chip technology. Bio chips
containing arrays of nucleic acid sequence can also be used to measure expression of genes
of M. catarrhalis or other species of Moraxella. For example, to diagnose a patient with an
M. catarrhalis or other Moraxella infection, a sample from a human or animal can be used
as a probe on a bio chip containing an array of nucleic acid sequence from the present
invention. In addition, a sample from a disease state can be compared to a sample from a
non-disease state which would help identify a gene that is up-regulated or expressed in the
disease state. This would provide valuable insight as to the mechanism by which the disease
manifests. Changes in gene expression can also be used to identify critical pathways
involved in drug transport or metabolism, and may enable the identification of novel targets
involved in virulence or host cell interactions involved in maintenance of an infection.
Procedures using such techniques have been described by Brown et al., 1995, Science
270:467-470.

Bio chips can also be used to monitor genetic changes, including deletions, insertions
or mismatches, associated with potential therapeutic compounds. Once the therapeutic is
administered to the patient, changes to the genetic sequence can be evaluated to assess its
efficacy. In addition, the
nucleic acid sequence of the present invention can be used to determine essential genes in
cell cycling. As described in Iyer et al., 1999 (Science, 283:83-87), genes essential in the
cell cycle can be identified using bio chips. Furthermore, the present invention provides
nucleic acid sequence which can be used with bio chip technology to understand regulatory
networks in bacteria, measure the response to environmental signals or drugs as in drug
screening, and study virulence induction (Mons et al., 1998, Nature Biotechnology,
16:45-48). Patents teaching this technology include U.S. Patents 5445934, 5744305, and 5800992.

DRUG SCREENING ASSAYS USING M. CATARRHALIS POLYPEPTIDES

By making available purified and recombinant M. catarrhalis polypeptides, the
present invention provides assays which can be used to screen for drugs which are either
agonists or antagonists of the normal cellular function, in this case, of the subject M.
catarrhalis polypeptides, or of their role in intracellular signaling. Such inhibitors or
potentiators may be useful as new therapeutic agents to combat M. catarrhalis infections in
humans. A variety of assay formats will suffice and, in light of the present inventions, will
be comprehended by the person skilled in the art.

In many drug screening programs which test libraries of compounds and natural
extracts, high throughput assays are desirable in order to maximize the number of
compounds surveyed in a given period of time. Assays which are performed in cell-free
systems, such as may be derived with purified or semi-purified proteins, are often preferred
as "primary" screens in that they can be generated to permit rapid development and relatively
easy detection of an alteration in a molecular target which is mediated by a test compound.
Moreover, the effects of cellular toxicity and/or bioavailability of the test compound can be
generally ignored in the in vitro system, the assay instead being focused primarily on the
effect of the drug on the molecular target as may be manifest in an alteration of binding
affinity with other proteins or change in enzymatic properties of the molecular target.
Accordingly, in an exemplary screening assay of the present invention, the compound of
interest is contacted with an isolated and purified M. catarrhalis polypeptide.

Screening assays can be constructed in vitro with a purified M. catarrhalis
polypeptide or fragment thereof, such as an M. catarrhalis polypeptide having enzymatic
activity, such that the activity of the polypeptide produces a detectable reaction product.
Suitable products include those with distinctive absorption, fluorescence, or
chemiluminescence properties, for example, because detection may be easily automated. The
efficacy of the compound can be assessed by generating dose response curves from data
obtained using various concentrations of the test compound. Moreover, a control assay can
also be performed to provide a baseline for comparison. A variety of synthetic or naturally
occurring compounds can be tested in the assay to identify those which inhibit or potentiate
the activity of the M. catarrhalis polypeptide. Some of these active compounds may directly,
or with chemical alterations to promote membrane permeability or solubility, also inhibit or
potentiate the same activity (e.g., enzymatic activity) in whole, live M. catarrhalis cells.
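
As an illustration only, the dose response analysis mentioned above might be sketched as follows. The readouts, the function names (percent_activity, estimate_ic50) and the choice of log-linear interpolation are assumptions made for this sketch; the specification does not prescribe a particular curve-fitting method.

```python
import math

def percent_activity(signal, control_signal, blank_signal=0.0):
    """Express a raw readout as a percentage of the no-compound control assay."""
    return 100.0 * (signal - blank_signal) / (control_signal - blank_signal)

def estimate_ic50(concentrations, activities):
    """Interpolate, on a log10 concentration axis, the dose giving 50% activity.

    Returns None if the curve never crosses 50% within the tested range.
    """
    points = sorted(zip(concentrations, activities))
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        crosses = (a_lo - 50.0) * (a_hi - 50.0) <= 0 and a_lo != a_hi
        if crosses:
            frac = (a_lo - 50.0) / (a_lo - a_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None

# Hypothetical readouts (arbitrary units), invented purely for illustration.
control = 1.00                           # control assay without test compound
doses_uM = [0.1, 1.0, 10.0, 100.0]       # test compound concentrations
signals = [0.95, 0.75, 0.25, 0.05]       # reaction product detected at each dose

activities = [percent_activity(s, control) for s in signals]
ic50 = estimate_ic50(doses_uM, activities)
print(f"Estimated IC50: {ic50:.2f} uM")
```

A potentiator rather than an inhibitor would show activities above 100% of control; the same percent_activity normalization applies, but the 50% crossing would then not be the quantity of interest.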

OVEREXPRESSION ASSAYS

Overexpression assays are based on the premise that overproduction of a protein 
would lead to a higher level of resistance to compounds that selectively interfere with the 
function of that protein. Overexpression assays may be used to identify compounds that 
interfere with the function of virtually any type of protein, including without limitation
enzymes, receptors, DNA- or RNA-binding proteins, or any proteins that are directly or
indirectly involved in regulating cell growth.

Typically, two bacterial strains are constructed. One contains a single copy of the
gene of interest, and a second contains several copies of the same gene. Identification of
useful inhibitory compounds using this type of assay is based on a comparison of the activity
of a test compound in inhibiting growth and/or viability of the two strains. The method
involves constructing a nucleic acid vector that directs high level expression of a particular
target nucleic acid. The vectors are then transformed into host cells in single or multiple
copies to produce strains that express low to moderate and high levels of protein encoded by
the target sequence (strains A and B, respectively). Nucleic acid comprising sequences
encoding the target gene can, of course, be directly integrated into the host cell.

Large numbers of compounds (or crude substances which may contain active
compounds) are screened for their effect on the growth of the two strains. Agents which
interfere with an unrelated target equally inhibit the growth of both strains. Agents which
interfere with the function of the target at high concentration should inhibit the growth of
both strains. It should be possible, however, to titrate out the inhibitory effect of the
compound in the overexpressing strain. That is, if the compound is affecting the particular
target that is being tested, it should be possible to inhibit the growth of strain A at a
concentration of the compound that allows strain B to grow.
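
The comparison of strains A and B can be sketched as a simple decision rule on the minimum inhibitory concentration (MIC) measured in each strain. The function name classify_compound, the result labels and the four-fold threshold are assumptions introduced for this illustration; the text above states only the qualitative premise.

```python
def classify_compound(mic_strain_a, mic_strain_b, min_fold_shift=4.0):
    """Classify a test compound from its MICs in strain A and strain B.

    mic_strain_a: MIC in the low/moderate-expression strain (None = no inhibition seen).
    mic_strain_b: MIC in the overexpressing strain (None = no inhibition seen).
    min_fold_shift: hypothetical threshold for calling the shift meaningful.
    """
    if mic_strain_a is None and mic_strain_b is None:
        return "inactive"
    if mic_strain_a is None:
        # Inhibits only the overexpressing strain: not predicted by the premise.
        return "inconclusive"
    if mic_strain_b is None or mic_strain_b / mic_strain_a >= min_fold_shift:
        # Strain A is inhibited at a concentration that still lets strain B grow.
        return "candidate target-specific inhibitor"
    # Both strains are inhibited at about the same concentration.
    return "likely unrelated target"

# Hypothetical MIC values in micrograms/ml, invented for illustration.
print(classify_compound(2.0, 16.0))   # candidate target-specific inhibitor
print(classify_compound(2.0, 2.0))    # likely unrelated target
```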

Alternatively, a bacterial strain is constructed that contains the gene of interest under
the control of an inducible promoter. Identification of useful inhibitory agents using this
type of assay is based on a comparison of the activity of a test compound in inhibiting
growth and/or viability of this strain under both inducing and non-inducing conditions. The
method involves constructing a nucleic acid vector that directs high-level expression of a
particular target nucleic acid. The vector is then transformed into host cells that are grown
under both non-inducing and inducing conditions (conditions A and B, respectively).

Large numbers of compounds (or crude substances which may contain active 
compounds) are screened for their effect on growth under these two conditions. Agents that 
interfere with the function of the target should inhibit growth under both conditions. It 
should be possible, however, to titrate out the inhibitory effect of the compound in the
overexpressing strain. That is, if the compound is affecting the particular target that is being
tested, it should be possible to inhibit growth under condition A at a concentration that
allows the strain to grow under condition B.

LIGAND-BINDING ASSAYS 

Many of the targets according to the invention have functions that have not yet been
identified. Ligand-binding assays are useful to identify inhibitor compounds that interfere
with the function of a particular target, even when that function is unknown. These assays
are designed to detect binding of test compounds to particular targets. The detection may
involve direct measurement of binding. Alternatively, indirect indications of binding may
involve stabilization of protein structure or disruption of a biological function. Non-limiting
examples of useful ligand-binding assays are detailed below.

A useful method for the detection and isolation of binding proteins is the
Biomolecular Interaction Assay (BIAcore) system developed by Pharmacia Biosensor and
described in the manufacturer's protocol (LKB Pharmacia, Sweden). The BIAcore system
uses an affinity purified anti-GST antibody to immobilize GST-fusion proteins onto a sensor
chip. The sensor utilizes surface plasmon resonance, which is an optical phenomenon that
detects changes in refractive indices. In accordance with the practice of the invention, a
protein of interest is coated onto a chip and test compounds are passed over the chip.
Binding is detected by a change in the refractive index (surface plasmon resonance).

A different type of ligand-binding assay involves scintillation proximity assays (SPA,
described in U.S. Patent No. 4,568,649).

Another type of ligand binding assay, also undergoing development, is based on the
fact that proteins containing mitochondrial targeting signals are imported into isolated
mitochondria in vitro (Hurt et al., 1985, EMBO J. 4:2061-2068; Eilers and Schatz, Nature,
1986, 322:228-231). In a mitochondrial import assay, expression vectors are constructed in
which nucleic acids encoding particular target proteins are inserted downstream of sequences
encoding mitochondrial import signals. The chimeric proteins are synthesized and tested for
their ability to be imported into isolated mitochondria in the absence and presence of test
compounds. A test compound that binds to the target protein should inhibit its uptake into
isolated mitochondria in vitro.

Another ligand-binding assay is the yeast two-hybrid system (Fields and Song, 1989,
Nature 340:245-246). The yeast two-hybrid system takes advantage of the properties of the
GAL4 protein of the yeast Saccharomyces cerevisiae. The GAL4 protein is a transcriptional
activator required for the expression of genes encoding enzymes of galactose utilization.
This protein consists of two separable and functionally essential domains: an N-terminal
domain which binds to specific DNA sequences (UASG); and a C-terminal domain
containing acidic regions, which is necessary to activate transcription. The native GAL4
protein, containing both domains, is a potent activator of transcription when yeast are grown
on galactose media. The N-terminal domain binds to DNA in a sequence-specific manner
but is unable to activate transcription. The C-terminal domain contains the activating
regions but cannot activate transcription because it fails to be localized to UASG. The
two-hybrid system uses two hybrid proteins containing parts of GAL4: (1) a GAL4
DNA-binding domain fused to a protein 'X' and (2) a GAL4 activation region fused to a
protein 'Y'. If X and Y can form a protein-protein complex and reconstitute proximity of the
GAL4 domains, transcription of a gene regulated by UASG occurs. Creation of two hybrid
proteins, each containing one of the interacting proteins X and Y, allows the activation
region of GAL4 to be brought to its normal site of action.

The binding assay described in Fodor et al., 1991, Science 251:767-773, which
involves testing the binding affinity of test compounds for a plurality of defined polymers
synthesized on a solid substrate, may also be useful.

Compounds which bind to the polypeptides of the invention are potentially useful as
antibacterial agents for use in therapeutic compositions.

Pharmaceutical formulations suitable for antibacterial therapy comprise the
antibacterial agent in conjunction with one or more biologically acceptable carriers. Suitable
biologically acceptable carriers include, but are not limited to, phosphate-buffered saline,
saline, deionized water, or the like. Preferred biologically acceptable carriers are
physiologically or pharmaceutically acceptable carriers.

The antibacterial compositions include an antibacterial effective amount of active
agent. Antibacterial effective amounts are those quantities of the antibacterial agents of the
present invention that afford prophylactic protection against bacterial infections or which
result in amelioration or cure of an existing bacterial infection. This antibacterial effective
amount will depend upon the agent, the location and nature of the infection, and the
particular host. The amount can be determined by experimentation known in the art, such as
by establishing a matrix of dosages and frequencies and comparing a group of experimental
units or subjects to each point in the matrix.
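
The matrix of dosages and frequencies described above can be laid out as follows. This is a minimal sketch: the dose levels, dosing frequencies, subject labels and the round-robin assignment are placeholders chosen for illustration, not values or procedures taken from the specification.

```python
from itertools import product

def build_dosing_matrix(doses, frequencies, subjects):
    """Assign subjects round-robin to every (dose, frequency) point in the matrix."""
    cells = list(product(doses, frequencies))
    groups = {cell: [] for cell in cells}
    for i, subject in enumerate(subjects):
        groups[cells[i % len(cells)]].append(subject)
    return groups

# Hypothetical design: 2 dose levels x 3 dosing frequencies, 12 subjects.
doses_mg_per_kg = [1, 5]
doses_per_day = [1, 2, 3]
subjects = [f"subject_{n:02d}" for n in range(12)]

matrix = build_dosing_matrix(doses_mg_per_kg, doses_per_day, subjects)
for (dose, freq), group in matrix.items():
    print(f"{dose} mg/kg, {freq}x per day: {group}")
```

In practice the assignment would normally be randomized; the round-robin rule is used here only to keep the sketch deterministic.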

The antibacterial active agents or compositions can be formed into dosage unit forms,
such as, for example, creams, ointments, lotions, powders, liquids, tablets, capsules,
suppositories, sprays, aerosols or the like. If the antibacterial composition is formulated into
a dosage unit form, the dosage unit form may contain an antibacterial effective amount of
active agent. Alternatively, the dosage unit form may include less than such an amount if
multiple dosage unit forms or multiple dosages are to be used to administer a total dosage of
the active agent. Dosage unit forms can include, in addition, one or more excipient(s),
diluent(s), disintegrant(s), lubricant(s), plasticizer(s), colorant(s), dosage vehicle(s),
absorption enhancer(s), stabilizer(s), bactericide(s), or the like.

For general information concerning formulations, see, e.g., Gilman et al. (eds.), 1990,
Goodman and Gilman's: The Pharmacological Basis of Therapeutics, 8th ed., Pergamon
Press; Remington's Pharmaceutical Sciences, 17th ed., 1990, Mack Publishing Co.,
Easton, PA; Avis et al. (eds.), 1993, Pharmaceutical Dosage Forms: Parenteral
Medications, Dekker, New York; and Lieberman et al. (eds.), 1990, Pharmaceutical Dosage
Forms: Disperse Systems, Dekker, New York.

The antibacterial agents and compositions of the present invention are useful for
preventing or treating M. catarrhalis infections. Infection prevention methods incorporate a
prophylactically effective amount of an antibacterial agent or composition. A
prophylactically effective amount is an amount effective to prevent M. catarrhalis infection
and will depend upon the specific bacterial strain, the agent, and the host. These amounts
can be determined experimentally by methods known in the art and as described above.

M. catarrhalis infection treatment methods incorporate a therapeutically effective
amount of an antibacterial agent or composition. A therapeutically effective amount is an
amount sufficient to ameliorate or eliminate the infection. The prophylactically and/or
therapeutically effective amounts can be administered in one administration or over repeated
administrations. Therapeutic administration can be followed by prophylactic administration,
once the initial bacterial infection has been resolved.

The antibacterial agents and compositions can be administered topically or
systemically. Topical application is typically achieved by administration of creams,
ointments, lotions, or sprays as described above. Systemic administration includes both oral
and parenteral routes. Parenteral routes include, without limitation, subcutaneous,
intramuscular, intraperitoneal, intravenous, transdermal, inhalation and intranasal
administration.

EXEMPLIFICATION 

CLONING AND SEQUENCING M. CATARRHALIS GENOMIC SEQUENCE

This invention provides nucleotide sequences of the genome of M. catarrhalis which
thus comprise a DNA sequence library of M. catarrhalis genomic DNA. The invention also
provides nucleotide sequences of two naturally occurring plasmids in M. catarrhalis. The
detailed description that follows provides nucleotide sequences of M. catarrhalis, and also
describes how the sequences were obtained and how ORFs (Open Reading Frames) and
protein-coding sequences can be identified. Also described are methods of using the
disclosed M. catarrhalis sequences in methods including diagnostic and therapeutic
applications. Furthermore, the library can be used as a database for identification and
comparison of medically important sequences in this and other strains of M. catarrhalis as
well as other species of Moraxella.

Chromosomal DNA from strain 98-4362 of M. catarrhalis was isolated using a
protocol described by Storrs et al. (J. Bacteriol. 173:4347-4352 (1991)). The only exception
to this protocol was that lysostaphin (120 U/ml) was used instead of lysozyme. The genomic
DNA prep involved a lysozyme:lysostaphin digestion, sodium dodecyl sulfate lysis,
Proteinase K and RNase treatment, phenol:chloroform extraction, and sodium acetate
precipitation, followed by the CsCl gradient to remove the plasmid.

In the construction of both libraries, genomic M. catarrhalis DNA was
hydrodynamically sheared in an HPLC and then separated on a standard 1% agarose gel. A
fraction corresponding to 2000-3000 bp in length was excised from the gel and purified by
the GeneClean procedure (Bio 101, Inc.).

The purified DNA fragments were then blunt-ended using T4 DNA polymerase. The
healed DNA was then ligated to unique BstXI-linker adapters (5'-GTCTTCACCACGGGG-3'
and 5'-GTGGTGAAGAC-3' in 100-1000 fold molar excess). These linkers are
complementary to the BstXI-cut pGTC vector, while the overhang is not self-complementary.
Therefore, the linkers will not concatemerize nor will the cut vector religate to itself easily.
The linker-adapted inserts were separated from the unincorporated linkers on a 1% agarose
gel and purified using GeneClean. The linker-adapted inserts were then ligated to BstXI-cut
vector to construct "shotgun" subclone libraries.
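
The statement that the adapter overhang is not self-complementary can be checked directly from the two oligonucleotide sequences given above. The sketch below assumes that the shorter oligonucleotide anneals flush with the 5' end of the longer one; the helper names are introduced only for this illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

LONG_OLIGO = "GTCTTCACCACGGGG"   # 5'->3', as given in the text
SHORT_OLIGO = "GTGGTGAAGAC"      # 5'->3', as given in the text

def adapter_overhang(long_oligo, short_oligo):
    """Return the single-stranded 3' tail left when the short oligo anneals
    to the 5' end of the long oligo (assumed annealing geometry)."""
    duplex = reverse_complement(short_oligo)
    if not long_oligo.startswith(duplex):
        raise ValueError("oligos do not anneal flush at the 5' end of the long strand")
    return long_oligo[len(duplex):]

overhang = adapter_overhang(LONG_OLIGO, SHORT_OLIGO)
is_self_complementary = overhang == reverse_complement(overhang)

print("overhang:", overhang)
print("self-complementary:", is_self_complementary)
```

Under this geometry the 11-base duplex leaves a four-base tail that cannot pair with another copy of itself, which is consistent with the statement that the linkers do not concatemerize.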

Only major modifications to the protocols are highlighted. Briefly, the library was
then transformed into DH5α competent cells (Gibco/BRL, DH5α transformation protocol).
It was assessed by plating onto antibiotic plates containing ampicillin and IPTG/Xgal. The
plates were incubated overnight at 37°C. Transformants were then used for plating of
clones and picking for sequencing. The cultures were grown overnight at 37°C. DNA was
purified using a silica bead DNA preparation method (Engelstein, 1996). In this manner, 25
µg of DNA was obtained per clone.

These purified DNA samples were then sequenced using primarily ABI dye-terminator
chemistry. All subsequent steps were based on sequencing by ABI377 automated
DNA sequencing methods. The ABI dye terminator sequence reads were run on ABI377
machines and the data were transferred to UNIX machines following lane tracking of the
gels. Base calls and quality scores were determined using the program PHRED (Ewing et
al., 1998, Genome Res. 8: 175-185; Ewing and Green, 1998, Genome Res. 8: 685-734).
Reads were assembled using PHRAP (P. Green, Abstracts of DOE Human Genome Program
Contractor-Grantee Workshop V, Jan. 1996, p. 157) with default program parameters and
quality scores.
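
For orientation, a PHRED quality score Q is related to the estimated base-calling error probability P by Q = -10 log10(P). The sketch below shows that conversion together with a simple end-trimming helper; the trimming rule and its threshold of 20 are assumptions for illustration and are not steps described in the text.

```python
import math

def phred_to_error_probability(q):
    """Convert a PHRED quality score to the estimated base-call error probability."""
    return 10 ** (-q / 10)

def error_probability_to_phred(p):
    """Convert an error probability to a PHRED quality score."""
    return -10 * math.log10(p)

def trim_low_quality_ends(qualities, min_q=20):
    """Return (start, end) indices of the read after trimming bases with
    quality below min_q from both ends (hypothetical trimming rule)."""
    start, end = 0, len(qualities)
    while start < end and qualities[start] < min_q:
        start += 1
    while end > start and qualities[end - 1] < min_q:
        end -= 1
    return start, end

# Hypothetical per-base quality scores for a short read.
read_qualities = [5, 10, 25, 30, 40, 12, 8]
print(phred_to_error_probability(20))
print(trim_low_quality_ends(read_qualities))
```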

Finishing followed the initial assembly. Missing mates (sequences from clones that
only gave reads from one end of the Moraxella DNA inserted in the plasmid) were
identified and sequenced with ABI technology to allow the identification of additional
overlapping contigs.

End-sequencing of randomly picked genomic lambda clones was also performed.
Sequencing of both sides was done for all lambda sequences. The lambda library backbone
helped to verify the integrity of the assembly and allowed closure of some of the physical
gaps. Primers for walking off the ends of contigs would be selected using pick_primer (a
GTC program) near the ends of the clones to facilitate gap closure. These walks can be
sequenced using the selected clones and primers. These data are then reassembled with
PHRAP. Additional sequencing using PCR-generated templates and screened and/or
unscreened lambda templates can also be done.

Additional templates for the physical gaps were obtained through PCR using primers 
designed from the ends of the contigs. These templates were then used in sequencing 
reactions to close the gaps. 

Contigs were ordered by aligning identified M. catarrhalis genes to the published
physical maps. Order was confirmed by PCR. The final chromosomal assembly included
119 contigs.

To identify M. catarrhalis polypeptides, the complete genomic sequence of M.
catarrhalis was analyzed essentially as follows: First, all possible stop-to-stop open reading
frames (ORFs) greater than 180 nucleotides in all six reading frames were translated into
amino acid sequences. Second, the identified ORFs were analyzed for homology to known
(archaebacterial, prokaryotic and eukaryotic) protein sequences. Third, the coding potential of
non-homologous sequences was evaluated with the program GENEMARK(TM)
(Borodovsky and McIninch, 1993, Comp. Chem. 17:123).
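
The first step of the analysis, translating stop-to-stop ORFs longer than 180 nucleotides in all six frames, can be sketched as below. This is an illustrative reimplementation and not the software actually used; it assumes the standard genetic code and treats the ends of the input sequence as segment boundaries. Coordinates are reported on the strand that was scanned.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
STOP_CODONS = {"TAA", "TAG", "TGA"}

def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def stop_to_stop_orfs(seq, min_nt=180):
    """Yield (strand, frame, start, end, protein) for every stop-free stretch
    longer than min_nt nucleotides in the six reading frames."""
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            seg_start = frame
            last = frame + 3 * ((len(s) - frame) // 3)   # end of the last full codon
            for i in range(frame, last + 1, 3):
                codon = s[i:i + 3]
                if i >= last or codon in STOP_CODONS:
                    if i - seg_start > min_nt:
                        protein = "".join(
                            CODON_TABLE.get(s[j:j + 3], "X")   # X for ambiguous codons
                            for j in range(seg_start, i, 3)
                        )
                        yield strand, frame, seg_start, i, protein
                    seg_start = i + 3

# Toy sequence: stop codon, then a 213-nucleotide stop-free stretch, then a stop codon.
toy = "TAA" + "ATG" + "GCT" * 70 + "TGA"
orfs = list(stop_to_stop_orfs(toy))
for strand, frame, start, end, protein in orfs:
    print(strand, frame, start, end, len(protein))
```

Each translated stretch could then be passed to a homology search, which corresponds to the second step described above.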

IDENTIFICATION, CLONING AND EXPRESSION OF M. CATARRHALIS NUCLEIC 
ACIDS 

Expression and purification of the M. catarrhalis polypeptides of the invention can 
be performed essentially as outlined below. 

To facilitate the cloning, expression and purification of membrane and secreted 
proteins from M. catarrhalis, a gene expression system, such as the pET System (Novagen),
for cloning and expression of recombinant proteins in E. coli, is selected. Also, a DNA 
sequence encoding a peptide tag, the His-Tag, is fused to the 3' end of DNA sequences of 
interest in order to facilitate purification of the recombinant protein products. The 3' end is 
selected for fusion in order to avoid alteration of any 5' terminal signal sequence. 

PCR AMPLIFICATION AND CLONING OF NUCLEIC ACIDS CONTAINING ORFs
ENCODING ENZYMES

Nucleic acids chosen (for example, from the nucleic acids set forth in SEQ ID NO: 1
- SEQ ID NO: 2501) for cloning from the 98-4362 strain of M. catarrhalis and plasmids are
prepared for amplification cloning by polymerase chain reaction (PCR). Synthetic
oligonucleotide primers specific for the 5' and 3' ends of open reading frames (ORFs) are
designed and purchased from GibcoBRL Life Technologies (Gaithersburg, MD, USA). All
forward primers (specific for the 5' end of the sequence) are designed to include an NcoI
cloning site at the extreme 5' terminus. These primers are designed to permit initiation of
protein translation at a methionine residue followed by a valine residue and the coding
sequence for the remainder of the native M. catarrhalis DNA sequence. All reverse primers
(specific for the 3' end of any M. catarrhalis ORF) include an EcoRI site at the extreme 5'
terminus to permit cloning of each M. catarrhalis sequence into the reading frame of the
pET-28b. The pET-28b vector provides sequence encoding an additional 20 carboxy-terminal
amino acids including six histidine residues (at the extreme C-terminus), which
comprise the His-Tag.
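
One way to read the primer design described above is sketched below. Several details are assumptions made for this illustration and are not stated in the text: that the ATG inside the NcoI site (CCATGG) serves as the start codon, that the trailing G of the site begins a GTT valine codon, that the native sequence resumes at its second codon, that any stop codon is omitted from the reverse primer, and that an 18-nucleotide annealing region is used. The reading frame relative to the vector-encoded His-Tag is not modelled.

```python
NCOI_SITE = "CCATGG"     # NcoI recognition sequence; contains an ATG codon
ECORI_SITE = "GAATTC"    # EcoRI recognition sequence
VALINE_CODONS = {"GTT", "GTC", "GTA", "GTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}

def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def design_primers(orf, anneal_len=18):
    """Return (forward, reverse) primers, written 5'->3', for an ORF sequence."""
    orf = orf.upper()
    body = orf[:-3] if orf[-3:] in STOP_CODONS else orf   # assumed: stop codon dropped
    # Assumed layout: CC ATG GTT + native codons from position 2 onward.
    forward = NCOI_SITE + "TT" + body[3:3 + anneal_len]
    reverse = ECORI_SITE + reverse_complement(body[-anneal_len:])
    return forward, reverse

# Hypothetical toy ORF, invented for illustration.
toy_orf = "ATG" + "GCA" * 10 + "TAA"
forward_primer, reverse_primer = design_primers(toy_orf)
print("forward:", forward_primer)
print("reverse:", reverse_primer)
```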

Genomic DNA or plasmid DNA prepared from the 98-4362 strain of M. catarrhalis
is used as the source of template DNA for PCR amplification reactions (Current Protocols in
Molecular Biology, John Wiley and Sons, Inc., F. Ausubel et al., eds., 1994). To amplify a
DNA sequence containing an M. catarrhalis ORF, genomic DNA (50 nanograms) is
introduced into a reaction vial containing 2 mM MgCl2, 1 micromolar synthetic
oligonucleotide primers (forward and reverse primers) complementary to and flanking a
defined M. catarrhalis ORF, 0.2 mM of each deoxynucleotide triphosphate (dATP, dGTP,
dCTP, dTTP) and 2.5 units of heat stable DNA polymerase (Amplitaq, Roche Molecular
Systems, Inc., Branchburg, NJ, USA) in a final volume of 100 microliters.
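
The final concentrations stated above can be converted into pipetting volumes with the dilution relation C1 x V1 = C2 x V2. In the sketch below the final concentrations and the 100 microliter volume come from the text, while the stock concentrations are hypothetical and would need to be replaced with those of the reagents actually on hand.

```python
def volume_to_add(final_conc, stock_conc, final_volume_ul):
    """Solve C1*V1 = C2*V2 for the stock volume V1 (concentrations in the same units)."""
    return final_conc * final_volume_ul / stock_conc

FINAL_VOLUME_UL = 100.0

# name: (final concentration from the text, hypothetical stock concentration)
components = {
    "MgCl2 (mM)": (2.0, 25.0),
    "each dNTP (mM)": (0.2, 10.0),
    "each primer (uM)": (1.0, 20.0),
}

volumes_ul = {
    name: volume_to_add(final, stock, FINAL_VOLUME_UL)
    for name, (final, stock) in components.items()
}
for name, vol in volumes_ul.items():
    print(f"{name}: add {vol:.1f} microliters of stock")
```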

Upon completion of thermal cycling reactions, each sample of amplified DNA is
washed and purified using the Qiaquick Spin PCR purification kit (Qiagen, Gaithersburg,
MD, USA). All amplified DNA samples are subjected to digestion with the restriction
endonucleases, e.g., NcoI and EcoRI (New England BioLabs, Beverly, MA, USA) (Current
Protocols in Molecular Biology, John Wiley and Sons, Inc., F. Ausubel et al., eds., 1994).
DNA samples are then subjected to electrophoresis on 1.0% NuSieve (FMC BioProducts,
Rockland, ME, USA) agarose gels. DNA is visualized by exposure to ethidium bromide and
long wave UV irradiation. DNA contained in slices isolated from the agarose gel is purified
using the Bio 101 GeneClean Kit protocol (Bio 101, Vista, CA, USA).


CLONING OF M. CATARRHALIS NUCLEIC ACIDS INTO AN EXPRESSION VECTOR 

The pET-28b vector is prepared for cloning by digestion with restriction
endonucleases, e.g., NcoI and EcoRI (Current Protocols in Molecular Biology, John Wiley
and Sons, Inc., F. Ausubel et al., eds., 1994). The pET-28a vector, which encodes a His-Tag
that can be fused to the 5' end of an inserted gene, is prepared by digestion with appropriate
restriction endonucleases.

Following digestion, DNA inserts are cloned (Current Protocols in Molecular 
Biology, John Wiley and Sons, Inc., F. Ausubel et al., eds., 1994) into the previously 
digested pET-28b expression vector. Products of the ligation reaction are then used to
transform the BL21 strain of E. coli (Current Protocols in Molecular Biology, John Wiley 
and Sons, Inc., F. Ausubel et al., eds., 1994) as described below. 

TRANSFORMATION OF COMPETENT BACTERIA WITH RECOMBINANT
PLASMIDS

Competent bacteria, E. coli strain BL21 or E. coli strain BL21(DE3), are transformed
with recombinant pET expression plasmids carrying the cloned M. catarrhalis sequences
according to standard methods (Current Protocols in Molecular Biology, John Wiley and
Sons, Inc., F. Ausubel et al., eds., 1994). Briefly, 1 microliter of ligation reaction is mixed
with 50 microliters of electrocompetent cells and subjected to a high voltage pulse, after
which samples are incubated in 0.45 milliliters SOC medium (0.5% yeast extract, 2.0%
tryptone, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2, 10 mM MgSO4 and 20 mM glucose) at
37°C with shaking for 1 hour. Samples are then spread on LB agar plates containing 25
microgram/ml kanamycin sulfate for growth overnight. Transformed colonies of BL21 are
then picked and analyzed to evaluate cloned inserts as described below.

IDENTIFICATION OF RECOMBINANT EXPRESSION VECTORS WITH M.
CATARRHALIS NUCLEIC ACIDS

Individual BL21 clones transformed with recombinant pET-28b M. catarrhalis ORFs 
are analyzed by PCR amplification of the cloned inserts using the same forward and reverse
primers, specific for each M. catarrhalis sequence, that were used in the original PCR 
amplification cloning reactions. Successful amplification verifies the integration of the M. 
catarrhalis sequences in the expression vector (Current Protocols in Molecular Biology, 
John Wiley and Sons, Inc., F. Ausubel et al., eds., 1994). 

ISOLATION AND PREPARATION OF NUCLEIC ACIDS FROM TRANSFORMANTS

Individual clones of recombinant pET-28b vectors carrying properly cloned M.
catarrhalis ORFs are picked and incubated in 5 ml of LB broth plus 25 microgram/ml
kanamycin sulfate overnight. The following day plasmid DNA is isolated and purified using
the Qiagen plasmid purification protocol (Qiagen Inc., Chatsworth, CA, USA).

EXPRESSION OF RECOMBINANT M. CATARRHALIS SEQUENCES IN E. COLI

The pET vector can be propagated in any E. coli K-12 strain, e.g., HMS174, HB101,
JM109, DH5, etc., for the purpose of cloning or plasmid preparation. Hosts for expression
include E. coli strains containing a chromosomal copy of the gene for T7 RNA polymerase.
These hosts are lysogens of bacteriophage DE3, a lambda derivative that carries the lacI
gene, the lacUV5 promoter and the gene for T7 RNA polymerase. T7 RNA polymerase is
induced by addition of isopropyl-β-D-thiogalactoside (IPTG), and the T7 RNA polymerase
transcribes any target plasmid, such as pET-28b, carrying its gene of interest. Strains used
include: BL21(DE3) (Studier, F.W., Rosenberg, A.H., Dunn, J.J., and Dubendorff, J.W.
(1990) Meth. Enzymol. 185, 60-89).

To express recombinant M. catarrhalis sequences, 50 nanograms of plasmid DNA
isolated as described above is used to transform competent BL21(DE3) bacteria (provided
by Novagen as part of the pET expression system kit) as described above. The lacZ gene
(beta-galactosidase) is expressed in the pET-System as described for the M. catarrhalis
recombinant constructions. Transformed cells are cultured in SOC medium for 1 hour, and
the culture is then plated on LB plates containing 25 micrograms/ml kanamycin sulfate. The
following day, bacterial colonies are pooled and grown in LB medium containing kanamycin
sulfate (25 micrograms/ml) to an optical density at 600 nm of 0.5 to 1.0 O.D. units, at which
point 1 millimolar IPTG is added to the culture for 3 hours to induce gene expression of
the M. catarrhalis recombinant DNA constructions.

After induction of gene expression with IPTG, bacteria are pelleted by centrifugation
in a Sorvall RC-3B centrifuge at 3500 x g for 15 minutes at 4°C. Pellets are resuspended in
50 milliliters of cold 10 mM Tris-HCl, pH 8.0, 0.1 M NaCl and 0.1 mM EDTA (STE
buffer). Cells are then centrifuged at 2000 x g for 20 min at 4°C. Wet pellets are weighed
and frozen at -80°C until ready for protein purification.

A variety of methodologies known in the art can be utilized to purify the isolated
proteins (Current Protocols in Protein Science, John Wiley and Sons, Inc., J. E. Coligan et
al., eds., 1995). For example, the frozen cells may be thawed, resuspended in buffer and
ruptured by several passages through a small volume microfluidizer (Model M-110S,
Microfluidics International Corporation, Newton, MA). The resultant homogenate may be
centrifuged to yield a clear supernatant (crude extract) and, following filtration, the crude
extract may be fractionated over columns. Fractions may be monitored by absorbance at
OD280 nm, and peak fractions may be analyzed by SDS-PAGE.

The concentrations of purified protein preparations may be quantified 
spectrophotometrically using absorbance coefficients calculated from amino acid content 
(Perkins, S.J. 1986 Eur. J. Biochem. 157, 169-180). Protein concentrations are also 
measured by the method of Bradford, M.M. (1976) Anal. Biochem. 72, 248-254, and Lowry,
O.H., Rosebrough, N., Farr, A.L. & Randall, R.J. (1951) J. Biol. Chem. 193, pages 265-275,
using bovine serum albumin as a standard.
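
Quantification against a bovine serum albumin standard, as in the Bradford method, typically means fitting a standard curve and reading the unknown off that curve. The sketch below assumes a linear response over the range used; the absorbance readings are invented for illustration and the helper names are not from the text.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def concentration_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve to estimate protein concentration."""
    return (absorbance - intercept) / slope

# Hypothetical BSA standards (mg/ml) and their absorbance readings at 595 nm.
bsa_mg_per_ml = [0.0, 0.25, 0.5, 1.0]
a595 = [0.05, 0.30, 0.55, 1.05]

slope, intercept = fit_line(bsa_mg_per_ml, a595)
unknown_mg_per_ml = concentration_from_absorbance(0.80, slope, intercept)
print(f"Estimated concentration: {unknown_mg_per_ml:.2f} mg/ml")
```

Readings that fall outside the range of the standards should be diluted and re-measured rather than extrapolated.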

SDS-polyacrylamide gels of various concentrations may be purchased from BioRad 
(Hercules, CA, USA), and stained with Coomassie blue. Molecular weight markers may 
include rabbit skeletal muscle myosin (200 kDa), E. coli β-galactosidase (116 kDa), rabbit
muscle phosphorylase B (97.4 kDa), bovine serum albumin (66.2 kDa), ovalbumin (45 kDa),
bovine carbonic anhydrase (31 kDa), soybean trypsin inhibitor (21.5 kDa), egg white
lysozyme (14.4 kDa) and bovine aprotinin (6.5 kDa).



EQUIVALENTS

Those skilled in the art will recognize, or be able to ascertain using no more than
routine experimentation, many equivalents to the specific embodiments and methods
described herein. The specific embodiments described herein are offered by way of example
only, and the invention is to be limited only by the terms of the appended claims, along with
the full scope of equivalents to which such claims are entitled.

ORF Name: 5112807_c3_52
NT ID: 19 | AA ID: 1939 | NT Length: [illegible] | AA Length: 279
NO-HIT


ORF Name: 26285938_ri_i
NT ID: 20 | AA ID: 1940 | NT Length: 207 | AA Length: 624 | Score: 541 | Probability: 4.1e-52
Locus Name: sp:YCEG_HAEIN | Acc#: P44720
Description: HYPOTHETICAL PROTEIN HI0457



ORF Name: [illegible]
NT ID: 21 | AA ID: 1941 | NT Length: 73 | AA Length: 219 | Score: 125 | Probability: 4.8e-08
Locus Name: sp:KTHY_BACSU | Acc#: P37537
Description: THYMIDYLATE KINASE (DTMP KINASE)










ORF Name: 2767080_tl_2
NT ID: 22 | AA ID: 1942 | NT Length: 373 | AA Length: 1122 | Score: 1522 | Probability: 4.6e-156
Locus Name: sp:EFTU_SHEPU | Acc#: P33169
Description: ELONGATION FACTOR TU (EF-TU)



ORF Name: 32110007_c2_8
NT ID: 23 | AA ID: 1943 | NT Length: 88 | AA Length: 267 | Score: 114 | Probability: 7.3e-07
Protein name: hypothetical protein PH1485
Locus Name: pir:H71023 | Acc#: H71023
















ORF Name: 36329582_ci_5
NT ID: 24 | AA ID: 1944 | NT Length: 60 | AA Length: 183 | Score: 144 | Probability: 5.5e-09
Locus Name: sp:YHA2_EIKCO | Acc#: P35649
Description: HYPOTHETICAL 66.3 KD PROTEIN IN UAG2 5' REGION


ORF Name: 971016_tl_l
NT ID: 25 | AA ID: 1945 | NT Length: 198 | AA Length: 597 | Score: 643 | Probability: 6.4e-63
Locus Name: sp:EFG_HELPY | Acc#: P56002
Description: ELONGATION FACTOR G (EF-G)



ORF Name: 222W12 t3 18
NT ID: 26 | AA ID: 1946 | NT Length: [illegible] | AA Length: [illegible] | Score: [illegible] | Probability: 1.9e-64
Protein name: glycerophosphoryl diester phosphodiesterase
Locus Name: :D75630 | Acc#: D75630



ORF Name: 234!>7692 tl 1
NT ID: 27 | AA ID: 1947 | NT Length: [illegible] | AA Length: 1179 | Score: 360 | Probability: 1.9e-42
Locus Name: sp:RECF_PSEPU | Acc#: P13456
Description: RECF PROTEIN



ORF Name: 26042927_t3_19
NT ID: 28 | AA ID: 1948 | NT Length: 84 | AA Length: 255
NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


26750837_ri_4 


|29 


| |1949 


i ui 


|336 | |202 | 


|4.9e-16 


Protein name 








Locus Name 


Acc# 


Hypothetical prot 


ein 






"] pir:S76551 


S76551 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

t — , i Score 
Length - 


Probability 


3614467SJ:i_2 


po 


1950 


525 


1578 1851 | 


|6.3e-191 



Protein name 

Description 
AMIDOTRANSFERASE) (GMP SYNTHETASE)



Locus Name 



sp:GUAA_HAEIN 



Acc# 
P44335 






ORF Name 



14298443 12 8 



Protein name 



Description 



NTID 



AAID 



NT 



AA 



, r> — m m — _ Score Probability 
Length Length 



j [ 31 | [1951 | [822 | [2469 | [2597 | |5.5e-270 — 

Locus Name Acc# 



sp:GYRB_ECOLI 



P06982:O08438



DNA GYRASE SUBUNIT B,










ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|12617<;27_ci__l 


II 32 


| 1952 


128 


1 P 87 1 I 550 1 


|1.2e-63 


Protein name 








Locus Name 


Acc# 

V 


transposase 








pir:IS77S0 


167760 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


|34175180_c2_2 


33 


| 1953 


90 


273 137 


1.7e-08 


Protein name 








Locus Name 


Acc# 


transposase 








|gp:AB026428 


AB026428 


Description 












Methylomonas aminofaciens ribulose monophosphate pathway genes (rmpD, rmpA,
IS10-R rmpl, rmpB), complete cds.




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


|16690875_i:l_2 


34 


| 1954 


82 


|249 | |90 | 


[0.00025 | 


Protein name 








Locus Name 


Acc# 


TolR protein 








gpiPPPALl 


X74218 


Description 












Pseudomonas putida ruvB, tolQ, tolR, tolA, tolB and oprL genes.


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


|1953953_c2_15 


IF 


1 l iybb 


534 


J1605 | |1387 | 


|9.3e-142 | 


Protein name 








Locus Name 


Acc# 



sp:AMA_NEIGO



Q02219 



Description 

MAJOR OUTER MEMBRANE PROTEIN PAN I PRECURSOR 






ORF Name 



122567557 t2 6 



NT ID 

1 F~ 



AAID 



NT 
!T77 



AA 

— , Score 
Length Length 



] |19bfe | |177 | |534 | [ 



Probability 
| |4.be-31 ~ 



Protein name 



Locus Name 



sp:YHDE_BACSU 



Acc# 
O07573



Description 

HYPOTHETICAL 16.6 KD PROTEIN IN aLPD-SPOVR INTERGENIC REGION



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


30544217_t2_8 




| |I957 


1 yj 


P 52 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . . Score 
Length 


Probability 


|4B81S33_i2_7 




| |1958 


ip i 


|189 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


6651712_t2_2 


39 


19S9 


271 


816 611 


l.fie-59 



Protein name 



Locus Name 



isocitrate lyase 



gp:AB004651



Acc# 
AB004651 



Description 



Hyphomicrobium methylovorum gene for isocitrate lyase, inorganic phosphate
transporter, methionine synthase, complete and partial cds.



ORF Name 



NTID 



AAID 



14647952 tl 1 



NT AA 
Length Length 
TT2 



2739 



Score Probability 
12108 | |3.7e-218 ~ 



Protein name 



Locus Name 



initiation factor IF2-alpha



gp:PVAJ2737 



Acc# 
AJ002737 



Description 

Proteus vulgaris infB gene and partial nusA and rbfA genes.






ORF Name 



istmam ci ib 



NTID AAID 

I F l [ 



NT 
Length 



AA 



T7T 



] 



Length 



Score Probability 
TT2~ | |4.ie-05 ~ 



Protein name 



Locus Name 



hypothetical protein 



pir:G75410



Acc# 
G75410 



Description 



ORF Name 



NTID AAID 



21644075 cl 14 



11 



NT 
Length 
TW5 



AA 

— _ Score Probability 
Length 



381 



] e 



|3.7e-35 



Protein name 



Locus Name 



conserved hypothetical protein



pir:F75410 



ACC# 
F75410 



Description 



ORF Name 



24650277 tl 3 



NTID AAID 

I F I t 



NT 
Length 



AA 

— , Score Probability 
Length *- 



] [ 



|2.&e-!>2 



Protein name 



Description 



Locus Name 



sp:TRUB_HAEIN



Acc# 
P45142 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


33327S0_i2JLl 


1 I 44 


| (1364 


i r i 






Protein name 








Locus Name 


ACC# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


3407812_t2_y 


|4b 


|196b 


168 


|507 | 215 


|1.4e-17 | 



Protein name 

Description 
RIBOSOME-BINDING FACTOR A (P15B PROTEIN)



Locus Name 



sp:RBFA_ECOLI



Acc# 
P09170 






ORF Name 



1457346^ c l 2 2A 



NTID 



AAID 



NT AA 
— — Score 
Length Length 



j [1955 | |103 | |312 | |171 | 



Protein name 



Locus Name 



conserved Hypothetical protein 



pir:F75410 



Probability 
p.0e-12 

Acc# 
F75410 



Description 



ORF Name 



NTID AAID 



NT AA 
— — Score 
Length Length 



14968825 ±2 5 



[2T7~ 



Protein name 
Description 

N UTILIZATION SUBSTANCE PROTEIN A



|bS4 | |456 I 
Locus Name 



Probability 
|3.7e-44 



sp:NUSA_ECOLI



Acc# 
P03003 



(NUSA PROTEIN) (L FACTOR) 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

t — , i Score 
Length 


Probability 


7070265_t 1_4 




1968 


62 


189 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


4143942_i3_l 


49 


1969 


319 | 


|957 | |164 


|l.le 


1 


Protein name 








Locus Name 




Acc# 


hypothetical protein £>i75y


pir:G64935 


G64935 


Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


1072952_£3_19 


50 


| 1970 


331 


|996 | 281 


|2.5e 


1 


Protein name 








Locus Name 




Acc# 



sp:SUG2_YEAST



Description 

PROBABLE PROTEASE SUBUNIT SUG2 (PROTEASOMAL CAP SUBUNIT)



P53549:Q08718






ORF Name 



112880 ti 4 



NTID 

]EZ 



AAID 



\TTTT 



NT 
Length 



AA 

T «-u Score 

Length 



Probability 
pOO | [120 | |1.7e-07 



Protein name 



Locus Name 



Acc# 



hypothetical protein APE2554




pir:C72489


C72489


Description 










ORF Name 


NTID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


146332<!>0_t2_12 


| |52 | 1972 


1 1 147 i 


r i 




Protein name 






Locus Name 


Acc# 


Description 










NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


19532782_c2_33 


53 1973 


1 1^ 


1B42 |1454 | 


|7.4e-149 | 


Protein name 






Locus Name 


Acc# 



sp:TRPE_ACICA



P23315 



Description 
ANTHRANILATE SYNTHASE COMPONENT I, 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


20939557_ti_i 


54 


1974 


138 | 


I 417 1 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


2207096b_t2_ll 


IF 


| |1975 


1 rn 1 


|372 | |88 | 


|0.018 | 



Protein name 



Locus Name 



alanine--tRNA ligase, alaS:alanyl-tRNA
synthetase:alanyl-tRNA synthetase



pir:D70127 



Acc# 
D70127 



Description 






ORF Name 



NT ID AAID 



23839667 ci 25 



NT 
Length 

] EEZ 



AA 



Length Score Probability 



] ED 



2.4e-72



Protein name 



Description 



Locus Name 



sp:DAPA_HAEIN



Acc# 
P43797 



DIHYDRODIPICOLINATE SYNTHASE, (DHDPS)






ORF Name 


NT 

NT ID AAID _ — . , 
Length 


AA 

. — . , Score 
Length 


Probability 


2628I300_c3_3b | 


57 | 1977 | 119 | 


|3S0 | 257 | 


5.1e-22 ] 


Protein name 




Locus Name 


Acc# 



sp:V01B_MYCTU



Q10514 



Description 
HYPOTHETICAL 19.6 KD PROTEIN CY427.11C



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


3050729l_t3_2O 


58 


1978 


174 


525 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


4792250_ci_26 




1979 


1 114 1 


P 45 1 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|5282805_c3_34 


1 60 


1980 


i p 41 


|726 | 786 | 


4.5e-78 


Protein name 








Locus Name 


Acc# 



sp:PUR7_ECOLI



P21155 



Description 
(SAICAR SYNTHETASE)






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|24020430_r2_i 




| |1981 


i iav i 


|381 ] |649 | 


1.5e-63


Protein name 








Locus Name 




Acc# 


transposase 


pir:I57760


167760 


Description 














ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

- — . , Score 
Length 


Probability


129813_r2_l 


| 62 


| |1982 


1 \ nb 1 


psi | 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


4391518J:2_4 




| |1983 


1 1 


|195 [108 | 


3.2e-06


Protein name 








Locus Name 




Acc# 



Description 



|sp:THIX_HAEIN 



P43787 



THIOREDOXIN-LIKE PROTEIN HI1115










i 


ORF Name NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|4495258_t2_2 | |64 


| |1984 


1 1 110 i 


| 512 


4.9e-49


Protein name 








Locus Name 




Acc# 


ferredoxin [3Fe-4S]




pir:FEAV




A29936:A00218


Description 












ORF Name NTID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


4860875_t3_6 65 


| 1985 




480 |204 | 


2.1e-16


Protein name 








Locus Name 




Acc# 


hypothetical protein APE2447 




pir:F72475 


F72475 



Description 






ORF Name 



NTID AAID 



— — Score Probability 



15677200 Tl 2 



SB" 



NT AA 
Length Length 

] EE ~ ~ 



Protein name 



Description 



[477 | |428 I |3.9e-4U 

Locus Name Acc# 



sp:CYSW_ECOLI 



P16702:P76534



SULFATE TRANSPORT SYSTEM PERMEASE PROTEIN CYSW




ORF Name 


NTID 


AAID 


NT AA 
— — Score 
Length Length 


Probability 


4490S78_ri_l 


1" 


| 1587 


| [247 [741 | 543 


6.4e-63 


Protein name 






Locus Name 


Acc# 



sp:CYSA_ECOLI 



Description 
SULFATE TRANSPORT ATP-BINDING PROTEIN CYSA



P16676:P77693



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


16054077_t3_20 




| [1998 


| 520 


1553 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


16455465_ti_l 




| 1585 


77 


234 | [72 | 


[0.020 


Protein name 








Locus Name 


ACC# 



sp:YDIE_ECOLI



P40721 



Description 

HYPOTHETICAL 7.1 KD PROTEIN IN AROH-NLPC INTERGENIC REGION



ORF Name 
|23485750_c3_35 



NTID AAID 

] r i 



— — Score Probability 



T5W 



NT AA 
Length Length 



[207 | 



Protein name 
Description 



Locus Name 



ACC# 



NO-HIT






ORF Name 



NT ID AAID 



NT AA 
— , — , Score 
Length Length 



Probability 



|23730017_cl_24 | |71 | |1991 


p 47 i 


|2844 | 


278 j 


|2 .2e 




Protein name 




Locus Name 




ACC# 






sp:YTFM_HAEIN




P44038 


Description 












HYPOTHETICAL PROTEIN HI0698 PRECURSOR 


ORF Name NT ID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


23859387_12_14 | |72 |1992 | 


296 


pi | 




|0.048 


Protein name 




Locus Name 




Acc# 


conserved hypothetical protein yrrB 




pir:H69978




H69978 


Description 












ORF Name NT ID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


34ii9002_±3_i8 | 73 | J1993 


444 


1335 


714 


1.1e-69


Protein name 




Locus Name 




Acc# 


2-acylglycerophosphoethanolamine
acyltransferase (aas) RP620


pir:E71667 


E71667 










Description 












ORF Name NT ID AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


4480217_c^b | |74 |1994 | 


1675 | 


5028 


678 


1.5e-79


Protein name 




Locus Name 




ACC# 



Description 



sp:YTFN_HAEIN



Q57523 



HYPOTHETICAL PROTEIN HI0696 | 


ORF Name NT ID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


12378407_c2_32 75 1995 


278 


834 




626 


4.1e-61



Protein name 
Description 

PYRIDOXAL PHOSPHATE BIOSYNTHETIC PROTEIN PDXJ



Locus Name 



sp:PDXJ_ECOLI



Acc# 
P24223 






ORF Name 



NTID AAID 



NT AA 
— — Score 
Length Length 



| 144fl79£>2 EI 7 



76 



1 FH [ 



[219 | 



Protein name 
Description 



Locus Name 



Probability 



Acc# 



NO-HIT



ORF Name 



161402 c2 29 



NTID AAID 
] I 77 1 | 1997 



NT AA 
Length Length 



Score 



] EO [ 



53" 



Probability 
0.018 



Protein name 



Locus Name 



envelope glycoprotein 



gp:HIVU90070



Acc# 
U90070 



Description 



HIV-1 strain VN15 from Vietnam, envelope glycoprotein V3 region (env) gene,
partial cds.



ORF Name 



NTID 



AAID 



NT AA 
— — Score 
Length Length 



15171905 c2 29 



][ 



73" 



^ [1998 | |57 | |204 | 



Protein name 
Description 



Locus Name 



Probability 



Acc# 



NO-HIT



ORF Name 



NTID AAID 



122324331 t2 16 



] [ 



NT AA 
Length Length 
77 | |234 



Score Probability 



Protein name 
Description 



Locus Name 



ACC# 



NO-HIT



ORF Name 



NTID AAID 



NT 



AA 



22453311 t3 22 



Length Length 

P 3 I EEO 



Score 



Protein name 

Description 

NO-HIT



Locus Name 



Probability 



ACC# 






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|23442503_c2_31 




| 2001 | 


346 


1041 | 831 


7.7e-83


Protein name 










Locus Name 




Acc# 


Era 




gp:AF123492 


AF123492 


Description 














Pseudomonas aeruginosa rnc-era-recO operon, complete sequence.






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|2441278i_c3_34 


|82 


| 2002 | 


101 


BOi 1 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|2657092b_c2_30 


83 


2003 


268 


|807 | |500 J 


9.1e-48


Protein name 










Locus Name 




ACC# 



Description 



sp:RNC_ECOLI 



P05797:P06141



RIBONUCLEASE III, (RNASE III)












ORF Name NTID 


AAID 


NT 
Length 


AA 
Length 


Probability 


26678567_Cl_24 | 84 


2004 




|192 | |88 | 


(0.00042 


Protein name 








Locus Name 




Acc# 


hypothetical protein 29.1




pir:S59084


S59084 


Description 














ORF Name NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|3516i562_cl_27 85 


| 2005 


| 212 


|639 | |103 | 


[0.0015 | 


Protein name 




Locus Name 




Acc# 



gp:AF123492



AF123492 



Description 

Pseudomonas aeruginosa rnc-era-recO operon, complete sequence. 






ORF Name 



14063308 c3 35 



Protein name 



Description 



NTID AAID 



NT AA 
— — Score 
Length Length 



Probability 



j |507 | [1824 | [2257 | | b.9e-234 



Locus Name 



sp:LEPA_HAEIN



Acc# 
P43729 



GTP-BINDING PROTEIN LEPA












ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|4100003_r3_20 


87 


| 2007 


159 


|480 | 


624 


6.6e-61



Protein name 



Description 



Locus Name 



sp:Y882_HAEIN



ACC# 
P44068 



HYPOTHETICAL PROTEIN HI0882 




ORF Name 


NTID AAID 


NT 
Length 


AA 

, — ^, Score Probability 
Length u 


| 7032838_c3_36 


88 2008 


367 


1104 276 2.0e-44 


Protein name 






Locus Name Acc# 


signal peptidase I




gp:ECOK12RIII | D64044


Description 








Escherichia coli ribonuclease III and other genes, complete cds.




ORF Name 


NTID AAID 


NT 
Length 


AA 

„ — ^ Score Probability 
Length 


|9869702_i:3_2i 


89 | 2009 


| 60 


r 3 i 


Protein name 






Locus Name Acc# 


Description 








NO-HIT 




ORF Name 


NTID AAID 


NT 
Length 


AA 

Length Score Probability 


|10802330_t3_20 


90 2010 


64 


195 | 


Protein name 






Locus Name Acc# 



Description 
NO-HIT 






ORF Name 



NT ID AAID 



12714056 cl 22 



TuTT" 



NT AA 

— , — ^, Score 

Length Length 

T77 



Probability 



Protein name 



[1134 | |1472 | |9.1e-lSi — 
Locus Name Acc# 



putative formaldehyde dehydrogenase



gp:&SP24394i 



AJ243941 



Description 



Pseudomonas sp. strain HR199 partial vanB, fdh, gcs, ehyA and ehyB genes.


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


14844626_c2_34 


|92 


| 2012 


202 


I 609 1 r 1 


|0.028 


Protein name 








Locus Name 


Acc# 


transcription regulator, TetR family




pir:F75482 


F75482 


Description 












ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


15705056_cl_24 




| 2013 


r 2 i 


pit | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

- — . , Score 
Length 


Probability 


1596<57_c2_31 


94 


2014 


" i 


P 04 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

r — , i Score 
Length 


Probability 


30079512_t3_17 


pk 


2015 


\ u I 


|231 | |87 | 


0.00053


Protein name 








Locus Name 


Acc# 



sp:FIXS_RHIME



P18399 



Description 
NITROGEN FIXATION PROTEIN FIXS 






ORF Name 
|3SS78402_frT" 

Protein name 



NT ID 



AAID 



NT 



AA 



Length Length 



Score Probability 



] | 96 | I 2016 | |441 | |1325 | [1115 | |4.Be-li3 

Locus Name Acc# 
sp:YEEF_ECOLI | P33016 



Description 

HYPOTHETICAL 49.8 KD TRANSPORT PROTEIN IN SBCB-HISL INTERGENIC REGION



ORF Name 
3910876 Fl 16 



NTID 



AAID 



[277T7~ 



NT AA 
Length Length 

] EEEI 



Score Probability 
2.1e-73



ITT 



Protein name 



Description 



Locus Name 



sp:YDIU_ECOLI 



Acc# 

P77649:P76904



HYPOTHETICAL 54.4 KD PROTEIN IN AROH-NLPC INTERGENIC REGION




ORF Name 


NT AA 
NTID AAID — , . — . , Score 
Length Length


Probability 


5097812_c2_^0 


| 98 2018 | |120 |353 | |280 | 


|1.9e-24 



Protein name 



Description 



Locus Name 



sp:YAIM_ECOLI



ACC# 

P51025:P77317



HYPOTHETICAL 31.4 KD PROTEIN IN MHPT-ADHC INTERGENIC REGION




ORF Name 


NT AA 

NTID AAID _ _ 

Length Length 


Score 


Probability 


S21031BJt2_10 


99 2019 | 289 |870 | 


196 


1.5e-15 | 



Protein name 
hypothetical protein HP0861



Locus Name 



|pir:E64627 



Acc# 
E64627 



Description 

ORF Name 
|6740S77Jf53IB" 



NTID AAID 



NT AA 

— , , — J _ 1 Score Probability 
Length Length 



TuTT 



27JTU - 



TTZT 



1.7e-62



Protein name 

stearoyl-CoA desaturase



Locus Name 



|gp:AF026401 



Acc# 
AF026401 



Description 

Mucor rouxii stearoyl-CoA desaturase (Ole1) gene, complete cds.






ORF Name 



1994001 cl 23 



NTID 

]EEI 



AAID 



NT 



AA 



Length Length 



Score Probability 



\TUTT 



Protein name 



Description 



] | 176 | | 531 | |573 | | 1.7e-b5 — 

Locus Name Acc# 



sp:YEIG_ECOLI 



P33018 



HYPOTHETICAL 31.3 KD PROTEIN IN FOLE-CIRA INTERGENIC REGION



ORF Name 



11048137 c3 6b 



NTID AAID 

] I 102 I P 77 " 



NT AA 
— — Score 

Length Length 



Protein name 



Description 



Locus Name 



Probability 



Acc# 



NO-HIT



ORF Name 



NTID AAID 



110585925 tl 2 



TUT 



TUTT 



NT AA 
Length Length 
73 



— . , Score 



TTT 



Protein name 
Description 
NO-HIT



Locus Name 



Probability 



Acc# 



ORF Name 



NTID 



AAID 



NT AA 
— — Score 
Length Length 



114885910 c2 SI 



TUT 



TUTT 



TZW 



71 



Probability 
0.026 



Protein name 



Locus Name 



TaglT 



|gp:AF013775 



Acc# 
AF013775 



Description 



Salmonella typhimurium PagK (pagK), PagM (pagM), and PagO (pagO) genes,
complete cds.



ORF Name 



122554587 c3 57 



NTI D AAID 



] 



2025 



NT AA 
Length Length 
] |480 



TV5 



Score Probability 
|480 | 



|1.2e-45 



Protein name 



Description 



Locus Name 



sp:SMPB_ECOLI 



ACC# 

P32052:P77011



SMALL PROTEIN B (18.3 KD PROTEIN)






ORF Name 



123437838 t3 28 



NT ID 

dee: 



AAID 



] [2026 | |725 | 



NT 
n 



AA 

— Score 
Length Length 



TTTW 



Probability 
[1584 | | J.le-iV3 ~ 



Protein name 



Description 



Locus Name 



sp:DNLJ_HAEIN



Acc# 
P43813 



) 


ORF Name 


NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|23468813_t3_22 


| |107 | |2027 


309 | 


p" i 


p 54 i 


|6.2e-26 | 



Protein name 



Locus Name 



putative permease BitE 



gp:SHU75349 



ACC# 
U75349 



Description 

Serpulina hyodysenteriae bit operon, complete sequence.



ORF Name NTID AAID . — , 

Length 


AA 

t — i i Score 
Length 


Probability 


234807_ti_ii 108 2028 175 


528 456 


4.2e-43 | 


Protein name 


Locus Name 


Acc# 


lipopolysaccharide core biosynthesis protein
kdtB homolog


pir:S72166


S72166






Description 






ORF Name NTID AAID — , 

Length 


AA 

t — , i Score 
Length 


Probability 


|23705040J:2_i6 | |109 | |2029 | |360 | 


|1083 |570 | 


|u.tie-&<> | 


Protein name 


Locus Name 


ACC# 




sp:POTA_HAEIN 


P45171


Description 






SPERMIDINE/PUTRESCINE TRANSPORT ATP-BINDING PROTEIN POTA


ORF Name NTID AAID _ — , 

Length 


AA 

, — , , Score 
Length 


Probability 


|23726587_t2_17 | |110 | |2030 | |335 | 


|1008 | |745 | 


|9.9e-74 | 


Protein name 


Locus Name 


Acc# 


conserved hypothetical protein yddN 


pir :F69776 


| F69776 



Description 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


|23884387_cl_37 


||1U 


1 2031 


1 F iy 1 


po | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


2425230b_t2_19 


| 112 


| 2032 


| 298 | 


|897 | 155 | 


8.2e-09


Protein name 








Locus Name 


ACC# 



Description 



sp:YDFC_BACSU 



P96680 



HYPOTHETICAL 33.6 KD PROTEIN IN CSPC-NAP INTERGENIC REGION




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


24797312_tl_4 


|iu 


2033 


275 


|828 | 132 | 


|5.0e-05 | 


Protein name 










Locus Name 


Acc# 


hypothetical protein PH1114 






pir:C71052 


C71052 


Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|25901467_C3_b4 


I 114 


2034 | 


88 | 


1267 1 




Protein name 










Locus Name 


Acc# 


Description 














NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


30272051_t3_24 


1 llb 


2035 | 


248 1 


|747 | |169 


2.2e-11


Protein name 










Locus Name 


ACC# 


probable morphological 
differentiation-associated protein 


pir:T35579 


T36679 









Description 






ORF Name 


JN I ID 


NT 

7i7VTn 

Length 


. , Score 
Length 








| (2036 | |286 | 


|851 | |232 


|2.3e-19 | 


Protein name 






Locus Name 


Acc# 


permease protein 






gp:CJAJ750 


AJ000750 


Description 


Campylobacter jejuni malF gene, partial.






ORF Name 


NT ID 


NT 

AAID — 

Length 


AA 

t — . i Score 
Length 


Probability 


35976510_r3_30 


117 


|2037 | 89 


|270 | |343 | 


4.0e-31


Protein name 






Locus Name 


Acc# 








pinFEKRV 


1 S72167:S78 


Description 








121:A00210 



ORF Name 



NT ID 



AAID 



NT 



AA 



36383542 tl 13 



12038 



— , — , Score 
Length Length 

TUT 



i ed en I 



Probability 
5.9e-05 



Protein name 



Locus Name 



KH type splicing regulatory protein 



E 



p:HSKH<5RP3 



Acc# 
AF093747 



Description 



Homo sapiens KH type splicing regulatory protein (KHSRP) gene, exon 2 and
partial cds.


ORF Name 


NT AA 
NT ID AAID — . , . — . , Score 
Length Length


Probability 


3923288_ci_39 


119 | |2039 343 | 1032 |254 | 


1.1e-21



Protein name 



Locus Name 



probable regulatory protein (pfoS/R)



|pir:E71373 



ACC# 
E71373 



Description 

ORF Name 
3938393 c3 64 



NT ID 

IF" 



AAID 



AA 

— Score 
Length Length 



][ 



NT 



|218 | |657 | 



Probability 
1.0e-71



Protein name 



Locus Name 



uracil phosphoribosyl transferase, upp



pir:A65026



Description 



Acc# 

A65026:S23412






ORF Name 



4064638 tl 3 



NTID 



\TTT 



AAID 



NT 
Length 

] F 71 



AA 

Le ~ th Score Probability 



Protein name 



Description 



[1115 | |152 | | 4.Ve-0B ~ 

Locus Name Acc# 



spTYTJTTISETTr 



P43951 



HYPOTHETICAL PROTEIN HI0131 PRECURSOR






ORF Name 


NT 

NTID AAID — 

Length 


AA 

— Score 
Length 


Probability 


|4101568_t3_29 


122 2042 | |263 


|792 | |512 | 


|4.9e-49 


Protein name 




Locus Name 


ACC# 



sp:FRP_VIBHA



Q56691 



Description 
(NADPH-FMN OXIDOREDUCTASE)



ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


p264i_ci_33 


123 


2043 


86 


|261 


1 I 100 1 


|2 .2e 


| 


Protein name 








Locus Name 




Acc# 


hypothetical protein PH0217




pir:G71244




G71244 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|I0790_£3_68 




|2044 


| 731 


2196 


1 P 4 1 


9.0e-86


Protein name 








Locus Name 




Acc# 










sp:PRIM_HAEIN 




Q08346 


Description 
















DNA PRIMASE,


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|1190i2_c3_ii8 


125 | 


2045 


1 


|1317 


j |1830| 


1.1e-188



Protein name 

Description 
HYPOTHETICAL PROTEIN HI0125 



Locus Name 



sp:YJCD_HAEIN



ACC# 
P44530 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length ■■ - 


Probability 


I22I4386_c3_iI7 


126 


2046 


P b 1 


i 378 i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|12540957_C3_121 


i r 7 


j |2047 


i p« i 


|843 | |227 


|7.7e-19 | 


Protein name 








Locus Name 


ACC# 


probable ytiH protein






pir:A70579


A70579


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — Score 
Length 


Probability 


I2S93961_r2_35 




|2048 


1 ^ 1 


|207 | 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

r — , i Score 
Length 


Probability 


19532813_r3_73 


IP 


2049 


| 134 


405 252 


1.7e-21 | 


Protein name 








Locus Name 


ACC# 


RpsT protein 








gp:VCNHAR


AJ002395


Description 


Vibrio cholerae nhaR, hlyU, mviN, and rpsT genes.




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


209375_cl_95 


|130 


| |2050 


| 750 


2250 |1867| 


1.3e-192


Protein name 








Locus Name 


Acc# 



sp:CLPA_ECOLI



Description 

ATP-DEPENDENT CLP PROTEASE ATP-BINDING SUBUNIT CLPA



P15716:P77686






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

„ — , , Score 
Length 


Probability 


|21641077J:2_49 


131 


| |205I 


1 1 1 " 


| pO | |132 


| |4.3e-08 


Protein name 








Locus Name 


Acc# 


hypothetical protein






gp:SYCSLLE 




Description 










D64003:AB001339


Synechocystis sp. PCC6803 complete genome, 22/27, 2755703-2868766.


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


|22143827_cl_89 


|132 


| 2052 


|250 


| |753 | 246 


7.5e-21


Protein name 








Locus Name 


ACC# 



Description 



sp:YIV8_YEAST 



P40582 



HYPOTHETICAL 26.8 KD PROTEIN IN HYR1 3' REGION


i 


ORF Name NTID AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


|22453453_c2_104 | |i33 |2053 | 


426 | 


|1281 | |492 | 


6.4e-47 | 


Protein name 




Locus Name 


Acc# 


carboxyl-terminal proteinase




pir:E70369 


F70369 


Description 








ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|22831262_ci_94 | |134 | |2054 | 


128 | 


|387 | |185 | 


|2.2e-14 | 


Protein name 




Locus Name 


Acc# 



sp:YLJA_ECOLI 



P75832 



Description 

12.2 KD PROTEIN IN CSPD-CLPA INTERGENIC REGION 



ORF Name 
|23632215_r2_59 



NTID 



AAID 



■27T5T5" 



NT AA 
— — Score 

Length Length 



ED 



Protein name 
Description 
NO-HIT



Locus Name 



Probability 



Acc# 






ORF Name 



23545875 cl 84 



Protein name 



Description 



NTID 



AAID 



][ 



NT 

2n 

] P? 



AA 

— Score 
Length Length 



Probability 



j [lfllfl | |723 | |2.1e-71 ~ 

Locus Name Acc# 



sp:CYDD_ECOLI



P29018:Q47656:P77275



TRANSPORT ATP-BINDING PROTEIN CYDD



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


23875303_c2_109 


|137 


| 2057 


p 2 i 


|219 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


24236642_cl_91 


138 


| 2058 


350 


1053 695 


2.0e-68


Protein name 










Locus Name 




Acc# 












sp:RLUD_ECOLI 


P33643:P77003


Description 














(PSEUDOURIDYLATE SYNTHASE) (URACIL HYDROLYASE)






1 


ORF Name 


NTID . 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


24317757_£3_67 


139 


2059 


368 


1107 352 


4.4e 


1 


Protein name 










Locus Name 




Acc# 












sp:YPIY_PSEAE


P33641 


Description 
















HYPOTHETICAL 38.5 KD LIPOPROTEIN IN PILS 5' REGION PRECURSOR (ORPY)


1 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

Length Score


Probability 


|24318805_t2_60 


| 140 


2050 


205 


|618 | |229 | 


4.8e- 


1 


Protein name 










Locus Name 




ACC# 


hypothetical protein








gp:ASA224767 


AJ224767 



Description 
Acinetobacter sp. ADP1 lon gene and ORFs.






ORF Name 



NT ID AAID 



24417012 r2 b'A 



141 | [2051 | [ 



NT AA 
Length Length 
TT2 1 — 



Score 

] ED 



Probability 

0.011



Protein name 



Locus Name 



LpsB 



gp:AF153023 



Acc# 
AF193023 



Description 



Sinorhizobium meliloti GreA (greA), LpsB (lpsB), LpsE (lpsE), LpsD (lpsD),
LpsC (lpsC), and Lrp (lrp) genes, complete cds.


ORF Name NTID AAID 


Score Probability 


f2'4S50052_c3_ii9 | 142 | |20<52 | 234 | |70b 


148 3.ie-0§ | 



Protein name 



Locus Name 



hypothetical protein c^fiu . 3



pir:T15745 



Acc# 
T15745 



Description 



ORF Name 



2468550 c3 123 



NTID 



AAID 



Score 



][ 



Protein name 

Description 
COPPER-TRANSPORTING ATPASE,



NT AA 
Length Length 

] p — i f^~i m 



Probability 
|1.4e-05 



Locus Name 



sp:COPA_HELFE



Acc# 
O32619



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


25351541_c2_ii6 


| 144 


| 2064 


| 258 j 


|857 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|2616^S_tl_i6 


| 145 


|2065 


217 


|554 | |603 | 


1.1e-58



Protein name 



Locus Name 



response regulator GacA 



gp:AF115381



Acc# 
AF115381 



Description 



Pseudomonas aureofaciens 30-84 response regulator GacA (gacA) gene, complete
cds.






ORF Name NTID AAID 


NT 
Length 


_ — . , Score 
Length 


Probability


31431512_i:i_22 | 146 |2066 | 


182


549 | 295


|4.ae-26 | 
1 1 


Protein name 




Locus Name 


Acc# 


bacterioferritin comigratory protein




pir:F71971


F71971 


Description 








ORF Name NTID AAID


NT 
Length 


AA 

. i Score 
Length 


Probability


31832188_c2_114 | 147 | 2067


440


1323 | 1025


2.1e-103


Protein name 




Locus Name 


Acc# 






sp:Y290_HAEIN 


P77868


Description 








PROBABLE CATION-TRANSPORTING ATPASE HI0290,


ORF Name NTID AAID 


NT 
Length 


AA 

T — . , Score 
Length 


Probability 


|33845302_c2_115 148 |2068 | 


288 


867 653 


|5.6e-64 | 


Protein name 




Locus Name 


ACC# 






sp:Y290_HAEIN 


P77868


Description 








PROBABLE CATION-TRANSPORTING ATPASE HI0290,


ORF Name NTID AAID


NT 
Length 


AA 

— , Score 


Probability 




Length 




|35974750J:2_38 | |145> | |2069 


261 | 


p. | P i | 


1.1e-58


Protein name 




Locus Name 


ACC# 






sp:YBGI_HAEIN


Q57354:O05008


Description




HYPOTHETICAL PROTEIN HI0105


ORF Name NTID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


4806512_c2_96 | |150 |2070 | 


463 


|1392 | |1501 | 


7.7e-154


Protein name 




Locus Name 


Acc# 


hypothetical protein 7 




pir:T00129 


T00129 



Description 






ORF Name 


NTID AAID


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


5109843_c2_99 


151 | 2071


575


1740 | 291


|4.3e 


1 


Protein name 








Locus Name 




Acc# 










sp:CYDC_ECOLI 


P23886 


Description 














TRANSPORT ATP-BINDING PROTEIN CYDC












ORF Name 


NTID AAID


NT 
Length 


„ — . , Score 
Length 


Probability 


|6718_C2_103 


152 2072 


531


1596 | 1457


3.5e-149



Protein name 



Description 



Locus Name 



sp:PMGI_ECOLI 



Acc# 
P37689 



(EC 5.4.2.1)


(PHOSPHOGLYCEROMUTASE)


(BPG-INDEPENDENT PGAM)




ORF Name 




NTID AAID 


NT AA 
Length Length 


Score 


Probability 


|6837753_ti_ 


23 


| 153 2073 


224 1 F 5 1 


I 147 1 


3.2e-08 | 



Protein name 



Locus Name 



capM protein (capM1) RP344



pir:B71591



Acc# 
B71691 



Description 

ORF Name 
| 7 89811_cl^ " 



NTID AAID 



NT AA 
— — Score 
Length Length 



Probability 



2074



Protein name 



Description 



552 | [2579 | [2203 | |2.3e-256 

Locus Name Acc# 



sp:GYRA_ECOLI



P09097 



DNA GYRASE SUBUNIT A,












ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|586638_c3_5 


I 1 " 


| 2075 


| 262 | 


|789 


[1149 | 


1.5e-116



Protein name 



Locus Name 



multidrug transporter homolog



pir:G69005



Acc# 
G69005 



Description 






ORF Name 



NTID



AAID 



12985037 c2 42 



NT 
Length 



AA 



Length Score Probability



] ED ED] I 



|1.9e-31 



Protein name 



Description 



Locus Name 



sp:PILQ_PSEAE



Acc# 
P34750 



FIMBRIAL ASSEMBLY PROTEIN PILQ PRECURSOR






ORF Name 


NTID


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|14301467_c3_49 


| 137 


|2077 


| 231 


696 | 316


|2.9e-28 


Protein name 








Locus Name 


Acc# 


carbonic anhydrase 








pir:D75298 


D75298 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


1557_cl_31 


|158 


2078 


| 501 


1505 1393 


2.1e-142



Protein name 
Description 

HYPOTHETICAL PROTEIN HI0019



Locus Name 



Acc# 



sp:YLEA_HAEIN | Q57163



ORF Name 
|19gi5587JH~T" 



NTID 
|159 



AAID 



NT 
Length 



AA 

t — fc u Score 
Length 



ST 



Protein name 



Description 



Locus Name 



Probability 



Acc# 



NO-HIT



ORF Name 



NTID 



AAID 



23445308 r2 18 



TO^TT 



NT 
Length 

1223 



AA 
Length 

|672 



Score Probability 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 






ORF Name 



NTID AAID



23859562 ci 32 



][ 



NT AA 
Length Length 
] |702 



Score Probability 
140 | 1.7e-09



Protein name 



Locus Name 



pilus expression protein



gp:PSEPONA



Acc# 
L28837 



Description 



Pseudomonas syringae penicillin binding protein (ponA), membrane proteins
(pilN, pilO), pilus expression proteins (pilM, pilP) genes, complete cds and
pilus expression protein (pilQ) gene, partial cds.



ORF Name 



24040911 cl 33 



NTID 



TUT 



AAID 



] [ 



NT AA 
— — Score 
Length Length 



] i 



Probability 
6.3e-30 



Protein name 



Description 



Locus Name 



sp:PILQ_PSEAE



Acc# 
P34750 



FIMBRIAL ASSEMBLY PROTEIN PILQ PRECURSOR






i 


ORF Name 


NTID 


NT 

AAID T — ^ 
Length 


_ — . , Score 
Length 


Probability 


34510950_c2_39 


| 163 


| 2083 645 | 


1938 | 201


4.9e-15


Protein name 






Locus Name 




ACC# 


membrane protein 


gp:PSEPONA


L28837 



Description 



Pseudomonas syringae penicillin binding protein (ponA), membrane proteins
(pilN, pilO), pilus expression proteins (pilM, pilP) genes, complete cds and
pilus expression protein (pilQ) gene, partial cds.


ORF Name NTID 


_ _ ___ NT AA 
AAID _ — . „ _ — . , Score 
Length Length 


Probability 


34589061_cl_36 | 154 


|2084 | |183 | |552 | 


36b 


|i.8e-3^ 



Protein name 



Locus Name 



lactoylglutathione lyase, glyoxalase I



pir:A46714 



Description 



Acc# 

A46714:A46623






ORF Name 



1 4304693 cl 34 



Protein name 



NTID
|165 



AAID 



NT AA 

— — Score 

Length Length 

T7F 



[1126 | [ 



TOT" 



Probability 
2.4e-88 



Locus Name 



Acc# 
O50468



Description 
3-DEHYDROQUINATE SYNTHASE,



ORF Name 


NTID


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


4877328_cl_35 


166 


| 2086 


318 | 


I 957 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


7042153_c2_43 


167 


| 2087 


231 | 


|696 | 452 


1.1e-42


Protein name 








Locus Name 


ACC# 



Description 



sp:AROK_HAEIN



P43880 



SHIKIMATE KINASE,


(SK)












ORF Name 


NTID AAID


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


7083457_c3_46 


| 168 | |2088 


pit | 


| |lb4 | 


4.2e-11


Protein name 








Locus Name 




Acc# 


fimbrial assembly protein piio




pir:S77728


S77728 


Description 














ORF Name 


NTID AAID


NT 
Length 


AA 

_ — . , Score 
Length


Probability 


23703142_cl_3 


||i<59 2089 


poo | 


900 |635 | 


4.5e-62


Protein name 








Locus Name 




ACC# 



sp:YJEK_ECOLI



P39280 



Description 
HYPOTHETICAL 38.7 KD PROTEIN IN MOPA-EFP INTERGENIC REGION






ORF Name 



134119052 £1 I 



NTID

]ezh 



AAID 



AA 

— — Score 
Length Length 



[TuW 



NT 
|n 

] EE! 



Protein name 



translation elongation factor EF-P



Description 



Locus Name 



] |pir:S34443 



Probability 
Acc# 



S34443:S56375:A65225



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


p2712915_c2_17 


|171 


| |209i 


1 1" 


i" 4 i 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|3398470i_r3_10 


172 


1 | 2092 


579 


1740 1233 


1.9e-125 


Protein name 








Locus Name 


Acc# 



sp:PMSR_NEIGO



P14930 



Description 

PEPTIDE METHIONINE SULFOXIDE REDUCTASE (PEPTIDE MET(O) REDUCTASE)



ORF Name 



NTID AAID 



135131500 c3 21 



fTTT 



][ 



NT AA 
— — Score 

Length Length 

tee 1 mr 



5T5~ 



Protein name 



Description 



Locus Name 



Probability 
|3.4e-64 

ACC# 



sp:HTPX_ECOLI 



P23894 



PROBABLE PROTEASE HTPX, (HEAT SHOCK PROTEIN HTPX) 


NT AA 

ORF Name NTID AAID _ — . , , — 

Length Length 


Score 


Probability 


|3907578_cl_I5 | 174 | |2094 299 900 


572 | 


2 .le-b5 



Protein name 

Description 
PYROPHOSPHORYLASE) 



Locus Name 



sp:DHPS_ECOLI 



Acc# 

P26282:P78110






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability


|4100312_r3_13 


1 l 17b 


| |2095 


i r 4 i 


P i | 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT 


ORF Name 


NTID 


AAID 


NT
Length


AA 

. — . , Score 
Length 


Probability 


pa2ao^_c2_i6 


| 175 


| |209(!> 


1 115 1 


|348 | |223 | 


|1.2e 


-16 


Protein name 








Locus Name 




Acc# 


probable transglycosylase 






pir:T12796




T12796:A69911


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|B3131B_i2J7 


| 177 


| |2097 


472 


J1419 | 1225 | 


|1.4e 


-124 


Protein name 








Locus Name 




Acc# 



Description 



sp:HFLX_ECOLI 



P25519 



GTP-BINDING PROTEIN HFLX


ORF Name NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


870250_t2_6 | 178 | 2098 


1 1 


|768 | 394 


|l.Se-3S 


Protein name 






Locus Name 


ACC# 


hypothetical protein in endA-gshB intergenic region




pir:A65080


A65080 










Description 










ORF Name NTID AAID 


NT 
Length 


AA 

T — . , Score 
Length 


Probability 


|10548386__t2_19 | 179 |2099 


647 


1944 | 




Protein name 






Locus Name 


ACC# 



Description 



NO-HIT






ORF Name 



106265S8 c2 94 



NTID AAID 

i 



[2TW 



NT 
Length 
TF2 



AA 

T — 4-1. Score 
Length 



] EZZ! EO [ 



Probability 



Protein name 



Description 



Locus Name 



sp:TEGP_HSV11



Acc# 
P06481 



TEGUMENT PHOSPHOPROTEIN US9 (10 KD PROTEIN)



ORF Name 



NTID 



AAID 



1178127 11 10 



NT 
Length 

] E= 



AA 

T — i-o- Score 
Length 



Protein name 



Description 



[1338 | [1319 | 

Locus Name 



Probability 
|l.Se-134 

ACC# 



|sp:SYS_HAEIN 



P43833 



SERYL-TRNA SYNTHETASE, (SERINE--TRNA LIGASE) (SERRS)



ORF Name 



12109585 ci 63 



NTID 



AAID 



TTUT 



NT 
Length 





AA 

_ — Score 
Length 



HUT 



Protein name 



Description 



Locus Name 



Probability 



ACC# 



NO-HIT 



ORF Name 



12892086 12 26 



NTID 

I F 7 " 



AAID 



][ 



IZTUT" 



NT 
Length 

] E^Z 



AA 

T — ^ Score 
Length 



[2T5~ 



Protein name 

Description 

NO-HIT 



Locus Name 



Probability 



ACC# 



ORF Name 



NTID 



AAID 



1369428 c2 97 



184 



3i 



NT 
Length 

r 5 — i 



AA 

T — -u Score 
Length 



Protein name 
Description 
NO-HIT



EZZZI 

Locus Name 



Probability 



Acc# 






ORF Name 



|I37i0925_t3_46 



Protein name 



Description 



NTID

]EEI 



AAID 



NT 



AA 



— . , _ — . , Score 



Length Length 



Probability 



][ 



TT5" 



EEZI 



] [ 



7.1e-64 



Locus Name 



sp:MT1C_MORBO



Acc# 
P34721 



METHYLTRANSFERASE MBOI C)


(M.MBOI C)






ORF Name NTID 


AAID — 

Length 


AA 

_ — . , Score 
Length 


Probability 


1412642_cl_65 186 


2106 |147 | 


|444 | |8 8 | 


0.00042


Protein name 




Locus Name 


Acc# 



sp:YRKI_BACSU 



P54436 



Description 

HYPOTHETICAL 8.2 KD PROTEIN IN BLTR-SPOIIIC INTERGENIC REGION



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|14250312_c2_100 


187 


pilT7 [ 


246 


741 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|1433466_c2_lll 


ir 


| |2108 | 


r i 


|258 | |141 | 


|9.5e-09 


Protein name 








Locus Name 


Acc# 



Description 



sp:MVIN_ECOLI



P75932 



VIRULENCE FACTOR MVIN HOMOLOG








ORF Name 


NTID AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


|14875390_t3_5i 


189 2109 


134 


|405 | |302 | 


1.0e-26


Protein name 






Locus Name 


ACC# 



sp:YAEL_ECOLI



P37764 



Description 

HYPOTHETICAL 49.1 KD PROTEIN IN CDSA-HLPA INTERGENIC REGION






ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


i5020887_ci_83 


| 190 


| |2110 


189 


IbVO 1 






Protein name 










Locus Name


Acc# 


Description 
















NO-HIT 




ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


15885450_c3_142 


191 


2111 


| 342 


|1029 | 


|624 


6.6e-61


Protein name 










Locus Name


Acc# 












sp:MVIN_HAEIN 


| P44958 


Description 
















VIRULENCE FACTOR MVIN HOMOLOG














ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


15S043_tl_8 


| 192 


| 2112 


259 


|780 | 


416 


7.3e-39 | 



Protein name 



Locus Name 



cytochrome c maturation protein B 



gp:AF044582



ACC# 
AF044582 



Description 



Shewanella putrefaciens NrfG homolog gene, partial cds; and mono-heme c-type
cytochrome ScyA (scyA), cytochrome c maturation protein A (ccmA), cytochrome c
maturation protein B (ccmB), cytochrome c maturation protein C (ccmC),
cytochrome c maturation protein D (ccmD), and cytochrome c maturation protein
E (ccmE) genes, complete cds.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


170<=>9628_ri_4 


193 


2113 


l m 1 


I 351 1 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


18770i_t2_21 


194 


2114 


113 


342 90 


0.00026 


Protein name 








Locus Name 


ACC# 



sp:Y4AR_RHISN



P55365 



Description 
HYPOTHETICAL 12.1 KD PROTEIN Y4AR






ORF Name 



22462757 ci 67 



NTID AAID

]EZI 



NT AA 
— — Score 
Length Length 



1EEO [ 



Protein name 



] EE!ZI EZZI 

Locus Name 



Probability 
0.00033

ACC# 



Hypothetical protein SC6E10.02 



pir:T35489



T35489 



Description 



ORF Name 



NTID AAID



NT AA 
— — Score 
Length Length 



| 23470003_clJ31 | |196 | [2116 | | |46S | 



3T5~ 



Protein name 

Description 
VIRULENCE FACTOR MVIN HOMOLOG



Locus Name 



Probability 

|2.3e-31 

Acc# 



|sp:MVIN_ECOLI 



J 



P75932 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|239i40i7_c2_i04 




2117


i r i 


257 134 


|5.5e-09 | 


Protein name 








Locus Name 


Acc# 


hypothetical protein yciaT 






pir:C69770


| C69770 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


24219792_t2_34 


(pa 


| [2118 


1 V 1 


891 | 440


2.1e-41


Protein name 








Locus Name 


Acc# 










sp:CDSA_PSEAE


Q59640 


Description 












SYNTHASE) 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

„ — t , Score 
Length 


Probability 


24244033_cl_70 


ir 


2119


302 | 


909 | |£20 | 


|l.Se-60 



Protein name 
Description 

HYPOTHETICAL 34.1 KD PROTEIN IN GLNA 3' REGION 



Locus Name 



sp:YGLA_SYNP2 



Acc# 
P28606 






ORF Name 



NTID AAID 



NT AA 
— — Score 
Length Length 



Probability 



|24252302_c2_105 
Protein name 



[ 200 | [ 2120 | |493 | [1482 | ^7TT\ | b.le-12b 

Locus Name Acc# 



2-oxoglutarate/malate translocator homolog yflS



pir:F69811



F69811 



Description 



ORF Name 



NTID 



AAID 



12433000b c3 122 



TUT 



TTTT 



NT AA 
Length Length 



Score 



] 



Probability 
l.Se-38 



Protein name 



Description 



Locus Name 



gp:AB017194



Acc# 
AB017194 



Plectonema boryanum ORF270, proline iminopeptidase, ferredoxin and amidase
enhancer genes, complete and partial cds.



ORF Name 



NTID 



NT AA 
AAID — _ — _ 
Length Length 



— . , Score 



Probability 



|24650962_t3_4b 


| |202 | |2122 


1 1 


|786 | |806 | 


3.4e-80


Protein name 








Locus Name 




Acc# 










sp:T2D1_STRPN




P09356 


Description 














(R.DPNI) 




ORF Name 


NTID AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


2473587B_t2JL6 


203 | 2123 


73 


Y n | |b4 | 


0.017


Protein name 








Locus Name 




Acc# 










sp:YMTO_YEAST 


Q04210 


Description 














HYPOTHETICAL 19.2 KD PROTEIN IN SUB1-ARGR1 INTERGENIC REGION








ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length


Probability 


|25391007_c2_110 


204 2124 


216 


651 |443 | 


1.0e-41


Protein name 








Locus Name 




Acc# 


N-acetyl-anhydromuramyl-L-alanine amidase




gp:AF082575 


AF082575 



Description 



Pseudomonas aeruginosa N-acetyl-anhydromuramyl-L-alanine amidase (ampD) and
transmembrane protein AmpE (ampE) genes, complete cds.






ORF Name 



125662782 12 24 



Protein name 

Description 
PROTEIN HELA) 



NTID 

]EEEI 



AAID 



NT AA 
Length Length 

] F* — I [ 



— . , Score 



77T 



Probability 
] pg— | |2.7e-25 



Locus Name 



sp:CCMA_RHOCA



ACC# 
P29959 



ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability


289052_cl_65 


| 206 |2126 


I 154 1 


|465 | |220 | 


|4.3e-18 


Protein name 








Locus Name 


ACC# 


conserved hypothetical protein 




pir:B75344


B75344


Description 












ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|29301457_r3_44 


| 207 | 2127 


53 i 


282 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|29507800_c2_95 


208 2128 


| 397 


1194 |883 | 


|2.4e-88 


Protein name 








Locus Name 


ACC# 










sp:RP32_PSEAE


P42378 


Description 












RNA POLYMERASE SIGMA-32 FACTOR










ORF Name 


NTID AAID 


NT 
Length 


— Score 
Length 


Probability 


|34569707_c3_131 


| 209 |2129 


9t i 


|288 | |74 | 


|0.023 



Protein name 



Locus Name 



F22C12.13 



gp:AC007764 



Acc# 
AC007764 



Description 



Genomic sequence for Arabidopsis thaliana BAC F22C12 from chromosome I,
complete sequence.






ORF Name 



136335200 FT 13 



NTID 

pio 



AAID 



NT 



AA 



— , — , Score 



][ 



Length Length 



Probability 



7TT 



] E 



le-48 



Protein name 



Locus Name 



sp:YAEL_ECOLI 



Acc# 
P37764 



Description 

HYPOTHETICAL 49.1 KD PROTEIN IN CDSA-HLPA INTERGENIC REGION



ORF Name 



I3552062S 12 31 



Protein name 



NTID 



AAID 



AA 

— Score 
Length Length 



TTTT 



NT 
n 



] l 771 1 | 723 | [ 



Probability 
|2.1e-71 



Locus Name 



UMP Kinase 



gp:AB010087



ACC# 
AB010087 



Description 



Pseudomonas aeruginosa rpsB, tsf, pyrH, frr genes for ribosomal protein S2,
elongation factor Ts, UMP kinase, ribosome recycling factor, complete cds.



ORF Name 



NTID 



AAID 



[2T2~ 



NT AA 
Length Length 
TF7 — 



Score 



Probability 
|7.6e-60 



Protein name 



Locus Name 



ribosome recycling factor



gp:AB010087



Acc# 
AB010087 



Description 



Pseudomonas aeruginosa rpsB, tsf, pyrH, frr genes for ribosomal protein S2,
elongation factor Ts, UMP kinase, ribosome recycling factor, complete cds.



ORF Name 


NT 

NTID AAID — 

Length 


AA 

. — . , Score 
Length 


Probability 


|391068_£2_33 


213 | |2133 | |272 | 


|819 | |534 


2.3e-51


Protein name 




Locus Name 


Acc# 






sp:UPPS_ECOLI 


Q47675:P75668


Description




(DI-TRANS-POLY-CIS-DECAPRENYLCISTRANSFERASE)






ORF Name 


NT 

NTID AAID , — . , 
Length 


AA 

T — . i Score 
Length 


Probability 


|39i5930_t3_50 


214 2134 | 204 


|615 | |592 


|1.6e-S7 | 



Protein name 
Description 

TRANSKETOLASE 1, (TK 1)



Locus Name 



sp:TKT1_ECOLI



Acc# 
P27302 






ORF Name 



NTID 



AAID 



NT 



AA 



— , . — L , Score 



Length Length 



Probability 



3947932 ti 41 



Protein name 



] p 15 | p 135 | p33 | pOO I |125 | |2.0e-Q{> 

Locus Name Acc# 

P76370 



sp:YEEZ_ECOLI 



Description 

HYPOTHETICAL 29.7 KD PROTEIN IN SBCB-HISL INTERGENIC REGION PRECURSOR



ORF Name 


NTID 


AAID 


NT 
Length 


AA Score
Length


Probability 


|4ii0687_t2_30 


|216 


2136


497 | 


|1494 | [1775 | 


|7.1e-183 | 


Protein name 








Locus Name 


Acc# 










sp:TKT1_ECOLI


P27302 


Description 












TRANSKETOLASE 1, (TK 1)










ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|4345300_i2_20 


217 


|2137 j |989 


2970 [2958 | 


|0.0 


Protein name 








Locus Name 


Acc# 










sp:SYV_HAEIN 


P43834 


Description 












VALYL-TRNA SYNTHETASE, (VALINE--TRNA LIGASE) (VALRS)




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|4495268_Cl_84 


IP 18 


| |2138 


1 110 1 


I 333 1 I 51 * 1 


|4.9e-49 


Protein name 








Locus Name 


ACC# 


ferredoxin [3Fe-4S]






pir:FEAV


A29936:A00218


Description




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


4693768_tl_li 


219 


| 2139 


435 


|1308 | |854 | 


|2.8e-85 


Protein name 








Locus Name 


ACC# 



sp:DXR_ECOLI 



Description 
REDUCTOISOMERASE)



P45568:P77209






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


477232S_cl_69 J 


|220 


| |2140 


1 1" i 


|2«2 | |77 | 


[0.0071 | 


Protein name 










Locus Name 




Acc# 


cytochrome b




gp:ASA228475 


AJ228475 


Description 
















Andricus solitarius cytb gene.














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|510962<!>_ii_6 | 


|22i 


i p 141 


1 I 81 1 


|246 | 355 


2.1e-32


Protein name 










Locus Name 




ACC# 












sp:MT1A_MORBO


P34720 


Description 
















METHYLTRANSFERASE MBOI A)


(M.MBOI A)












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|535028i_c3_139 


|222 


2142 


76 | 


231 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT 




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


|6823912_13_37 | 


|223 


|2143 


1 \ u 1 


192 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT 




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|103187_t2_5 | 


|224 


| |2144 


I r 


|297 | 







Protein name Locus Name Acc# 

Description 

NO-HIT






ORF Name 



NTID



AAID 



NT 



AA 



Length Length 



Score 



15917 n 4 



Protein name 



Description 



[225 | [2145 | |U4 | |495 | pT~| 

Locus Name 



Probability 
|2.6e-27 



sp:CYST_ECOLI



Acc# 
P16701 



SULFATE TRANSPORT SYSTEM PERMEASE PROTEIN CYST




ORF Name 


NTID


AAID 


NT AA 
— — Score 
Length Length 


Probability 


|20153930__cl_9 


| |226 


| |2146 


| 271 |813 | 502 | 


|5.6e-48 | 


Protein name 






Locus Name 


ACC# 



sp:RHLB_HAEIN 



P44922 



Description 
ATP-DEPENDENT RNA HELICASE RHLB HOMOLOG



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|24257755_ci_8 


227 


2147 


155 


I 468 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|165941S7_tl_5 


228 


| 2148 


| 510 


|IS33 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|22897255_±2_6 


1 229 


| 2149 


1 V 1 


|810 | |305 | 


|4.2e-27 



Protein name 



Locus Name 



putative acyltransferase



gp:SCM10 



Acc# 
AL133469 



Description 
Streptomyces coelicolor cosmid M10. 






ORF Name NTID


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|24485937_t:*_13 | |230 


| 2150 | 


62 | 


|189 | |I47 | 


1.7e-09


Protein name 






Locus Name 




Acc# 


glutamate dehydrogenase


gp:UAN010746 


AJ010746 


Description 












Antarctic bacterium TAD1,


one gene. 










ORF Name NTID 


AAID 


NT 
Length 


— Score 
Length 


Probability 


|2501552_r3_9 |231 


|piU | 


|288 | 


|867 | 547 | 


|9.5e- 


1 


Protein name 






Locus Name 




ACC# 








sp:FTSH_ECOLI 




P28691 


Description 












CELL DIVISION PROTEIN FTSH,


ORF Name NTID 


AAID 


NT 
Length 


AA 

m — . , Score 
Length 


Probability 


|254i5636_ti_4 232 


i F^~n 


679 


2040 | |1I48| 


7.4e-181



Protein name 



Description 



Locus Name 



sp:HTPG_ECOLI 



Acc# 
P10413 



PROTEIN C62.5)


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|2S366686_c2__24 


| 233 


| 2153 


| 731 


|2376 | 


|1520 | 


|7.5e-156 | 



Protein name 



Locus Name 



penicillin-binding protein 1A 



gp:PAU73780



Acc# 
U73780 



Description 



Pseudomonas aeruginosa penicillin-binding protein 1A (ponA) gene, complete
cds, and malic enzyme gene, partial cds.


NT AA 

ORF Name NTID AAID , — . , „ — _ 

Length Length 


Score 


Probability 


|1230466iJ:2_i8 234 | |2154 | [584 | (1755 | 


|7S3 | 


|1.2e-75 


Protein name Locus 


Name 


Acc# 



sp:RECN_ECOLI



Description 

DNA REPAIR PROTEIN RECN (RECOMBINATION PROTEIN N)



P05824:P76602






ORF Name 



16578133 c3 57 



NTID

]Ezn 



AAID 



NT AA 
— — Score 
Length Length 



l P^n f 



f7T- 



Probability 

] prrm 



Protein name 



Description 



Locus Name 



sp:PSBR_TOBAC



Acc# 
Q40519 



PHOTOSYSTEM II 10 KD POLYPEPTIDE PRECURSOR (PII10)






ORF Name 


NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


19564510_t2_17 


|23S | |2156 


| 194 


pss | 


|444 | 


7.8e-42 | 



Protein name 



Locus Name 



N-formylmethionylaminoacyl-tRNA deformylase,



pir:S23107



Description 



ORF Name 



Acc# 

S23107:S41694:A49696:B65121



123554638 r3 29 



NTID 

]EZn 



AAID 



AA 

— Score 
Length Length 



NT 
n 
21T5 



] GEO 



531 



Probability 
|4.7e-51 



Protein name 



Locus Name 



beta-ketoacyl-acyl carrier protein synthase 
II 



gp:AF188707 



ACC# 
AF188707 



Description 



Photobacterium profundum acyl carrier protein (acpP) gene, partial cds;
beta-ketoacyl-acyl carrier protein synthase II (fabF) gene, complete cds; and
aminodeoxychorismate lyase (pabC) gene, partial cds.



ORF Name 



NTID 



AAID 



23912502 ti 9 



2T5^" 



NT AA 
Length Length 



Score Probability 
1200 | | 5.6e-16 



Protein name 



Description 



Locus Name 



sp:YHHP_ECOLI



Acc# 
P37618 



HYPOTHETICAL 9.1 KD PROTEIN IN FTSY-NIKA INTERGENIC REGION




ORF Name 


NTID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


23985753_t3_27 


| |239 | |2159 | 


I 1 " 1 


|504 |273 | 


|1.0e-23 


Protein name 






Locus Name 


Acc# 



|gp:ECU28377 



U28377 



Description 

Escherichia coli K-12 genome; approximately 65 to 68 minutes. 






ORF Name 



NTID 



AAID 



2430226J FT 5 



t f^ — i r 



NT 
Length 

] 



AA 

Length 



Probability 



Protein name 
hypothetical protein b2948



|582 | |340 | |8.2e-31 
Locus Name 



pir:C65080



Acc# 
C65080 



Description 



ORF Name 



243534b8 t2 20 



NTID 



AAID 



NT 
Length 
-| [308 



AA 



Protein name 



Score Probability 

Length 

|927 | |671 | |6.9e-6& — 
Locus Name Acc# 



site-specific recombinase



gp:AF033497 



AF033497 



Description 

Proteus mirabilis site-specific recombinase (xerD) gene, complete cds.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — Score 
Length 


Probability 


245425S2_t2_13 


|242 


2152 


102 


309 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


3007832_t2_iy 


243 


2153 


| 169 


I 510 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length


Probability 


35205013_t3_23 | 


|244 


|2154 




|10B6 | |291 | 


|1.3e-25 


Protein name 








Locus Name 


Acc# 


hypothetical protein






pir:G75388


| G75388 



Description 






ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 




245 | |2U5 | 


pil | 


|536 | 


|l.le 


1 


Protein name 








Locus Name 




Acc# 


imidazoleglycerol-phosphate synthase


pir:D69070


D69070 


Description 














ORF Name 


NTID AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


488342b_ri_2 


| |24S | |2166 | 


|206 | 


|621 | |234 | 


|1.4e- 


1 


Protein name 








Locus Name 




Acc# 










sp:YQIA_ECOLI 




P36653 


Description 














HYPOTHETICAL 21.6 KD PROTEIN IN PARE-ICC INTERGENIC REGION (F193)




ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


6506_ti_3 


| |247 |2167 


637 


1914 | 2041


4.6e-211



Protein name 



Locus Name 



topoisomerase IV subunit



gp:AB003429



Acc# 
AB003429 



Description 

Pseudomonas aeruginosa DNA for topoisomerase IV subunit, complete cds.



ORF Name 



NTID 



AAID 



805180 cl 38 



][ 



2168



NT AA 
Length Length 



— . , Score 



j | |554 | 



Probability 
1.7e-53



Protein name 
Description 

IMIDAZOLEGLYCEROL-PHOSPHATE DEHYDRATASE, (IGPD)



Locus Name 



sp:HIS7_PEA 



Acc# 
Q43072 



ORF Name 



823381 13 24 



NTID 

]ehi 



AAID 



NT AA 
Length Length 

] EE! 



Score Probability 



TUT 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 






ORF Name 


NTID


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|U627£>l_cI_43 




| |2170 


i p i 


pi. | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|12281888_ci_40 


|25i 


2171


78 


|237 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID


AAID 


NT 
Length 


AA 

- — . , Score 
Length 


Probability 


1367177_i:2_i5 


| P » 


2172


1304 


| |674 | 


3.3e-56 


Protein name 








Locus Name 


Acc# 



sp:GALU_ECOLI



P25520 



Description 

URIDYLYLTRANSFERASE) (URIDINE DIPHOSPHOGLUCOSE PYROPHOSPHORYLASE)



ORF Name 
|14463877__t3_23 



NTID



AAID 



TTTT 



NT 
n 



AA 

— Score 
Length Length 



Probability 
1.0e-23



Protein name 



Locus Name 



sp:YJGQ_ECOLI 



Acc# 
P39341 



Description 

HYPOTHETICAL 39.8 KD PROTEIN IN PEPA-GNTV INTERGENIC REGION (O351)



ORF Name 



NTID 



AAID 



15651!* t2 20 



NT AA 
Length Length 
T75" 



— , — , Score Probability 



EZZI 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


16040927_c2_50 




| pi7S 


1 ^ 1 


P » | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


16610183_c2_S4 


ip" 


| 2176 


1 1 315 i 


|933 | |569 | 


|4 .4e-55 


Protein name 








Locus Name 


ACC# 



sp:TESB_ECOLI



P23911 



Description 
ACYL-COA THIOESTERASE II,



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


|16819827_ri_6 


257 


| 2177 


| 137 


414 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — i i Score 
Length - - - ■■ 


Probability 


19S3188S_c3_57 


258 


2178 


50 


183 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — , , Score 
Length 


Probability 


19538388J:2_14 


ip" 


| 2179 


1 1" i 


|228 | |73 | 


0.015


Protein name 








Locus Name 


ACC# 



gp:SMI240618



AJ240618 



Description 
Streptococcus mitis xpt gene, strain 12261.






ORF Name


NTID


AAID




NT 
Length 


AA 

. — . , Score 
Length 


Probability 


120942936 r3 27 

1 - - 


260 


1 12180 1 
1 1 1 


376 1 
1 


|1131 | |1060| 


4.1e-107


Protein name 










Locus Name 




Acc# 












sp:GALE_BACSU


P55180 


Description 
















GALACTOSE 4-EPIMERASE) 




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


21675051_13_28 


261 


| 2151 | 


321 | 


|966 | |447 | 


|3 . 8e 


1 


Protein name 










Locus Name 




ACC# 












sp:YRFI_ECOLI 


P45803 


Description 
















HYPOTHETICAL 32.5 KD PROTEIN IN MRCA-PCKA INTERGENIC REGION








ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|23557930_c3_6i 


252 


2182 


|619 


i860 |1768| 


3.9e-182



Protein name 



Locus Name 



glucosamine synthase



gp:AF032884



Description 



Acc# 

AF032884:L77909



Thiobacillus ferrooxidans N-acetylglucosamine-1-phosphate uridyltransferase
(glmU) gene, partial cds; glucosamine synthase (glmS) and RecG (recG) genes,
complete cds; and transposon Tn5468, complete sequence.



ORF Name 



NTID 



AAID 



23634680 F2 18 



NT AA 

— — Score 

Length Length 

] EHI 



E 



Probability 
2.3e-35 



Protein name 



Locus Name 



putative UDP-glucose dehydrogenase 



gp:ALW243431



Acc# 
AJ243431 



Description 



Acinetobacter lwoffii wzc, wzb, wza, weeA, weeB, wceC, wzx, wzy, weeD, weeE,
weeF, weeG, weeH, weeI, weeJ, weeK, galU, ugd, pgi, galE, pgm (partial) and
mip (partial) genes (emulsan biosynthetic gene cluster), strain RAG-1.






ORF Name 
|24400250_t3_24 
Protein name 

Description 



NTID AAID 



NT AA 
— — Score 
Length Length 



Probability 



] P 64 I p 184 | \ U0 | I 2583 | P^~l |6-4e-118 ~ 

Locus Name Acc# 



sp:PLSB_HAEIN



P44857 



GLYCEROL-3-PHOSPHATE ACYLTRANSFERASE, (GPAT)


ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


2449380i_t3_30 265 |2185 


1 P 75 1 


1128 | |489 


1.3e-46


Protein name 




Locus Name 


Acc# 


FauI DNA methyltransferase




gp:AF029070


AF029070 


Description 


Flavobacterium aquatile FauI DNA methyltransferase (fauIM) gene, complete cds.








ORF Name NTID AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


26797302_c2_55 j | 2 £6 2185 


1 393 1 


1182 515 


2.3e-49 | 



Protein name 



Locus Name 



sp:YAIW_ECOLI 



Acc# 
P77562 



Description 

HYPOTHETICAL 40.4 KD PROTEIN IN SBMA-DDLA INTERGENIC REGION



ORF Name 



NTID AAID 



3317260 11 5 



NT AA 

— — Score 

Length Length 

F73 1 [T72T 



Probability 
2.9e-154 



Protein name 



Locus Name 



putative phosphoglucose isomerase



gp:ALW243431



Acc# 
AJ243431 



Description 



Acinetobacter lwoffii wzc, wzb, wza, weeA, weeB, wceC, wzx, wzy, weeD, weeE,
weeF, weeG, weeH, weeI, weeJ, weeK, galU, ugd, pgi, galE, pgm (partial) and
mip (partial) genes (emulsan biosynthetic gene cluster), strain RAG-1.



ORF Name 



3938762 12 22 



NTID 
■j |268 



AAID 



NT AA 
— — Score 
Length Length 



] F^n [ 



IT 



7T" 



Probability 
0.026



Protein name 



Locus Name 



transcription regulator homolog yozG 



pir:C69931



Acc# 
C69931 



Description 






ORF Name 


NT ID 


AAID 


NT 
Length 


r — . , Score 
Length 


Probability


|672963S_c2_46 


| 269 




1 I 171 1 




1 in — nn/;^ 1 


Protein name 








Locus Name 


ACC# 


hypothetical protein C45H4.14




pir:T32722


T32722 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — L , Score 
Length 


Probability


|976387_r2_19 


| |270 


1 ziyu 


i r i 


1 i 


IvJ . UUz j 


Protein name 








Locus Name 


Acc# 


hypothetical protein T16L4.170




pir:T09929


T09929 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


T — , i Score 
Length 


Probability 


10823462_Cl_13 


| 271 


| |2191 


1 


|204 




Protein name 
Description 








Locus Name 


Acc# 


NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length ■ ■ - 


Probability 


|1237S535_r2_5 


272 


| 2192 


214 


|645 | |74 


| |0.0011 | 


Protein name 








Locus Name 


Acc# 










gp:VCU39068


U39068 


Description 












Vibrio cholerae pathogenicity island, partial and complete cds.


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|22065635_r2_9 


1 273 


| |2193 


1 1" 1 i 


|1566 | |1440 


| |2.2e-147 | 


Protein name 








Locus Name 


Acc# 


sodium/proline symporter opuE:proline transporter opuE


pir:H69670 


H69670 











Description 






ORF Name NTID AAID




NT 
Length 


AA 

, — . , Score 
Length 


Probability 


24228400_ci_20 | |274 


| |2i94 


i 479 i 


|1440 | |1110 | 


2.1e-112


Protein name 








Locus Name 




Acc# 










sp:HEMN_ECOLI 




Description 












P32131:P76772


(COPROPORPHYRINOGENASE) (COPROGEN OXIDASE)


ORF Name NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


25328457 ci 14 1 275 
- - 1 


| 2195 


98 | 


|297 |95 | 


7.5e-05


Protein name 








Locus Name 




Acc# 










sp:MINE_ECOLI 




P18198 


Description 














CELL DIVISION TOPOLOGICAL SPECIFICITY FACTOR












ORF Name NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|4880303_ci_19 | |275 


| 2195 


| 1*3 | 


|582 | 514 


3.0e-49


Protein name 








Locus Name 




Acc# 










sp:PTH_HAEIN 


P44682 


Description 














PEPTIDYL-TRNA HYDROLASE, (PTH)














ORF Name NT ID 


AAID 


NT 
Length 


AA 

, — , , Score 
Length 


Probability 


|6835875_t2_4 | |277 


| 2197 


50 


183 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT 




ORF Name NTID 


AAID 


NT 
Length 


AA 

T — Score 
Length ■ 


Probability 


|900011_C3_28 | 278 


|2198 


| 234 


|705 | 259 


2.7e- 


1 


Protein name 








Locus Name 




Acc# 


probable ribosomal protein L25






pir:H71555 


H71665 



Description 






ORF Name 



9869006 ti 2 



NTID 
"j [279 



AAID 



]EE!ZI[ 



NT AA 
Length Length 
|219 



TI 



Score Probability 
12 71 | | 1.7e-23 ~ 



Protein name 



Locus Name 



30S subunit ribosomal protein S21



gp:AF014397 



Acc# 
AF014397 



Description 



Pseudomonas putida macromolecular synthesis operon: 30S subunit ribosomal
protein S21 (rpsU), DNA primase (dnaG), and sigma-70 (rpoD) genes, complete
cds.



ORF Name 



NTID AAID 



11885875 c3 76 



NT 
n 



AA 

— Score 
Length Length 



11358 



Probability 
1218 j |7.5e-124 — 



Protein name 



Locus Name 



Acc# 



sp:Y164_HAEIN



Description 






P43955:P43956


HYPOTHETICAL PROTEIN HI0164/165 


ORF Name NTID AAID 


NT 

Length 


AA 

. — , , Score 
Length 


Probability 


12S8778i_c3_70 | 281 |220i 


174 


525 512 


4.9e-49 


Protein name 




Locus Name 


Acc# 



Description 



sp:IF3_HAEIN 



P43814 



TRANSLATION INITIATION FACTOR IF-3 


ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


14093967_ci_49 | 282 |2202 


211 


(636 | |74i | 


|2.Ge-73 


Protein name 




Locus Name 


ACC# 


NqrE 


gp:AF165980


AF165980


Description 



NADH-quinone oxidoreductase complex operon, complete sequence.






ORF Name 



NTID 



AAID 



14b40908 c!S 7 7 



751" 



77UT 



NT AA 
Length Length 
] |270 



Score 



[2TT 



T73~ 



Protein name 



Locus Name 



Probability 
|B.2e-45 

ACC# 



NqrC 



gp:AF117331 



AF117331 



Description 



Vibrio cholerae N16961 Na+-translocating NADH-ubiquinone oxidoreductase
enzyme complex, complete sequence.



ORF Name 



NTID 



AAID 



115865712 n 71 



NT AA 
Length Length 



— . , Score 



J |570 | 



Probability 
6.0e-12



Protein name 



Description 



Locus Name 



gp:ECOUW93 



Acc# 
U14003 



Escherichia coli K-12 chromosomal region from 92.8 to 00.1 minutes.



ORF Name 



NTID 



AAID 



NT AA 
— — Score 
Length Length 



15460432 c2 65 



73" 



3TT 



Protein name 
Description 



Locus Name 



Probability 



Acc# 



NO-HIT



ORF Name 



NTID 



AAID 



22038177 13 27 



NT 
n 
557 



AA 

— Score 
Length Length 



Probability 



Protein name 



[1674 | [1801 | | 1.2e-185 

Locus Name Acc# 



putative efflux pump component MtrF 



gp:AF176821 



AF176821 



Description 



Neisseria gonorrhoeae strain EU7 5 putative efflux pump component MtrF (mtrF)
gene, complete cds.



ORF Name 



124423260 cl 42 



][ 



NTID 
287 



AAID 



NT AA 
— — Score 

Length Length 



EE3 



TTT 



Probability 
2.5e-05



Protein name 



Locus Name 



pr2 



gp:MHU19289 



Acc# 
U19289 



Description 



Mycoplasma hyopneumoniae J ATCC 2 7219 multidrug resistance protein homologs
pr1 and pr2 genes, complete cds, and 23S rRNA gene, partial sequence.






ORF Name 



NT ID AAID 



NT 



AA 



Length Length 



Score Probability 



25392778 El 1 



EST 



Protein name 



[2205 | |20i | |S06 | p7~ | |8.6e-36 ~ 

Locus Name Acc# 



4-hydroxyphenylacetate 3-monooxygenase (EC



gp:D90737



Description 



D90737:AB001340



Escherichia coli genomic DNA. (22.8 - 23.1 min).




ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


31268837J:3_28 


| 289 


| |2209 


412 


1239 |183S| 


|2.4e-189 ] 


Protein name 








Locus Name 


Acc# 



Description 



sp:CATA_HAEIN



P44390 



CATALASE,



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


3322329i_r2J.9 


| 290 


|2210 


r 


P" 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


3323b9!47_c2_59 


| 291 


2211 


|782 


|2349 | |1415| 


1.0e-144 


Protein name 








Locus Name 


ACC# 



Description 
VACB PROTEIN 



sp:VACB_ECOLI



P21499:P76800



ORF Name 
33857132 ±1 12 



Protein name 
Description 



NTID 



AAID 



— — -u Score Probability 



T92~ 



NT AA 
Length Length 
1 ISTT 



Locus Name 



Acc# 



NO-HIT 






ORF Name 



NT ID AAID 



NT AA 
— — Score 
Length Length 



Probability 



13399183 c2 Si 



Protein name 



] I 253 1 \ 22U 1 I 415 | P 48 | p^F] p.Ue-129 

Locus Name 



TTqrTT 



gp:AF117331



Acc# 
AF117331 



Description 



Vibrio cholerae N16961 Na+-translocating NADH-ubiquinone oxidoreductase
enzyme complex, complete sequence.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


34000785_c3J73 


1 P 94 


|2214 




185 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


34196052_c2_63 


295 


| 2215 


| 414- 


|12S1 | J1650 | 


|1.2e-169 



Protein name 



Locus Name 



NqrP 



gp:AF117331



ACC# 
AF117331 



Description 



Vibrio cholerae N16961 Na+-translocating NADH-ubiquinone oxidoreductase
enzyme complex, complete sequence.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


3939043_c2_58 


| 296 


| |2216 


542 


1929 


(2200 | 


6.5e-228


Protein name 










Locus Name




Acc# 












sp:SYT_HAEIN




P43014 


Description 


















(THRRS)


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


4720967_c2J>2 


I 297 


| |2217 


i p" i 


P 4 1 




9.8e-67


Protein name 










Locus Name




Acc# 



sp:Y158_HAEIN



Description 
HYPOTHETICAL PROTEIN HI0158/159



P43958:P43959






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


473137_cl_41 


|29B 


| |2218 


i i vt i 


pi i 






Protein name 








Locus Name 




ACC# 


Description 














NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|4801625_12_20 | 


|299 


| |2219 


| 252 


|759 | |622 | 


1.1e-60


Protein name 








Locus Name 




ACC# 










sp:HIS4_RHOSH




P50936 


Description 














ISOMERASE, 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|58822ii_t2_i4 


poo 


| |2220 


118 


P bV | |ibi | 


|2 .7e 


I 


Protein name 








Locus Name 




ACC# 


1 hypothetical protein 1 






pir:S47051




S47051 


Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


68264I_c2_55 


301 


2221 


85 


|261 | |100 | 


2.2e-05


Protein name 








Locus Name 




Acc# 


hypothetical protein phu^iv




pir:G71244 




G71244 


Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


14103377_r2_9 1 


302 


| |2222 


I 1 " 1 


|501 | |434 | 


9.0e 


1 


Protein name 








Locus Name 




ACC# 










sp:MTGA_ACICA




O24849



Description 

(EC 2.4.2.-) (MONOFUNCTIONAL






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


16973437_c3_30 


|303 


| |2223 


i r 


P 40 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

t — ■ i Score 
Length 


Probability 


|19745308_tl_3 


IF 4 


|2224 




poi | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|24261257_i3_i4 


pos 


| 2225 


ir 


|372 | |140 | 


|8.0e-09 


Protein name 








Locus Name 


Acc# 



Description 



sp:PNCB_SALTY



P22253 



NICOTINATE PHOSPHORIBOSYLTRANSFERASE, (NAPRTASE)






ORF Name 


NT AA 

NTID AAID „ _ _ 

Length Length 


Score 


Probability 


|2560092b_ti_ 


2 305 2226 91 |275 | 


I 58 1 


9.9e-05



Protein name 

Description 
(EC 2.4.2.-) (MONOFUNCTIONAL TGASE)



Locus Name 



sp:MTGA_ACICA



Acc# 
O24849



ORF Name 



NTID 



AAID 



130659433 c2 21 



TUT 



][ 



TZTT 



NT AA 
Length Length 

[55 1 mr 



— — Score Probability 



Protein name 
Description 
NO-HIT



Locus Name 



ACC# 






ORF Name 



NTID AAID 



13303178 ti i 



][ 



NT AA 
Length Length 
T79~ 



— , Score 



] EIO ED [ 



Probability 
|2.4e-40 



Protein name 



Locus Name 



solanesyl diphosphate synthase



gp:AB001997 



Acc# 
AB001997 



Description 



Rhodobacter capsulatus DNA for solanesyl diphosphate synthase, complete cds.


NT AA 

ORF Name NTID AAID . — . , , — L , 

Length Length 


Score 


Probability 


35182887_c2_23 309 | 2229 | 191 |576 | 


684 | 


2.9e-67 | 



Protein name 



Description 



Locus Name 



sp:IPYR_HAEIN 



Acc# 
P44529 



PHOSPHO-HYDROLASE) (PPASE)








1 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


3512S655_ri_4 


310 


2230 - 


i " 4 i 


1125 1283 | 


9.7e-131 | 


Protein name 








Locus Name 


ACC# 



Description 



sp:AROC_HAEIN 



P43875 



PHOSPHOLYASE)


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


6834702_r2_ii 


311 


|223i 


1 162 1 


489 | |37i | 


|4.3e-34 | 


Protein name 








Locus Name 


ACC# 



Description 



sp:YCHJ_HAEIN



P44609 



HYPOTHETICAL PROTEIN HI0277










ORF Name 


NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


882636_cl_i5 


312 2232 


258 


|777 | 


417 


5.7e-39 | 



Protein name 



Locus Name 



lipoate biosynthesis protein B 



gp:AF147448 



Acc# 
AF147448 



Description 



Pseudomonas aeruginosa strain PAO1 penicillin-binding protein 2 (pbpA),
rod-shape-determining protein (rodA), membrane-bound lytic transglycosylase
(mltB), rare lipoprotein A (rlpA), penicillin-binding protein 5 (dacA), and
lipoate biosynthesis protein B (lipB) genes, complete cds; and unknown gene.






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


973756_c3_34 


| 313 


| |2233 


138 


1*" i 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|975055_r2_10 


| 314 


| 2234 


i ,4s i 


2238 | |2349 | 


1.1e-243



Protein name 



Locus Name 



polyphosphate kinase



ACRBDOXN 



Acc# 
Z46863 



Description 



Acinetobacter sp. cysD, cobQ, sodM, lysS, rubA, rubB, estB, oxyR, ppk, mtgA,
ORF2 and ORF3 genes.



ORF Name 



I10673EI87 tl 4 



Protein name 



NT ID AAID 



Score 



TTET 



] i 



NT AA 
Length Length 



buz 



Probability 
5.3e-123 



Locus Name 



sp:TYRB_ECOLI



Acc# 
P04693 



Description 

AROMATIC-AMINO-ACID AMINOTRANSFERASE, (AROAT) (ARAT



ORF Name 



NTID 



AAID 



NT 



AA 



Length Length 



Score Probability 



14572162 tl 1 



] |223S | |26'0 | |783 | 



7.0e-57 



Protein name 



Description 



Locus Name 



sp:YCIK_ECOLI 



ACC# 

P31808:P77516



(ec i. 


ORF Name 


NT AA 
NTID AAID , — . , _ — , , Score 
Length Length 


Probability 


20126385_r2_8 


317 J 2237 198 |597 | 325 


3.2e-29


Protein name 


Locus Name 


ACC# 




sp:YTFL_ECOLI 


P39319 


Description 






HYPOTHETICAL 49.8 KD PROTEIN IN CYSQ-MSRA INTERGENIC REGION








ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


2609375_c2_26 


ii 518 


| |2238 




F 9 1 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT 




ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


26808317_t2_6 


i p 19 


| |2239 




| 576 | 


8.1e-56


Protein name 








Locus Name 




Acc# 










sp:UBIG_ECOLI 






Description 












P17993:P76924


METHYLTRANSFERASE)




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


34334050_i3_i5 


| |320 


| |2240 


ir 


|927 | |889 1 


5.5e-89


Protein name 








Locus Name 




ACC# 










sp:YTFL_ECOLI




P39319 


Description 














HYPOTHETICAL 45.8 KD PROTEIN IN CYSQ-MSRA INTERGENIC REGION






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — , , Score 
Length 


Probability 


|3Sii568_i2_7 


| 321 


|224i 


|233 


|702 257 | 


4.5e-23


Protein name 








Locus Name 




Acc# 










sp:GPHC_ALCEU 




P40852 


Description 














PHOSPHOGLYCOLATE PHOSPHATASE, CHROMOSOMAL, (PGP)








ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — , , Score 
Length 


Probability 


4117193_c2_25 


| 322 


|2242 


||M* 


1521 | 953 


7.9e-97


Protein name 








Locus Name 




ACC# 


leucine aminopeptidase 






gp:PPU010261




AJ010261 


Description 















Pseudomonas putida pepA gene. 






ORF Name 



4144818 t3 16 



][ 



NT ID AAID 



TIT 



NT AA 
Length Length 

] EE " ~ 



Score Probability 



] 



TuTT9~ 



b.9e-76 



Protein name 



Locus Name 



probable ionic transporter



pir:F70819



Acc# 
F70819 



Description 



ORF Name 



4976550 tl 3 



NT ID AAID 



NT AA 

— — ^, Score Probability 
Length Length -L 



TI&T 



312 



] [ 



SIT 



|4.6e-!i7 



Protein name 



Description 



Locus Name 



sp:YBHD_ECOLI



Acc# 

P52696:P75761



HYPOTHETICAL TRANSCRIPTIONAL REGULATOR IN MODC-BIOA INTERGENIC REGION



ORF Name 



NT ID 



AAID 



NT AA 
— — Score 
Length Length 



1441017 cl 38 



TIT 



] P 8 I P 57 | 



TIT 



Protein name 



opacity protein opa5l 



Description 



Locus Name 



pir:S36329



Probability 
1.2e-07



ACC# 

S36329:S28628



ORF Name 



114462827 c3 b3 



Protein name 



NT ID 



TIT 



AAID 



?I7ZT 



ribosomal protein S15 



Description 

ORF Name 
|14494026_c2_50 
Protein name 



NT ID 



AAID 



TIT 



titt 



Description 
ATP PHOSPHORIBOSYLTRANSFERASE,



NT AA 
— — Score 
Length Length 



TT 



J |270 | pgr 



Locus Name 



pir:S38882



NT AA 
— — Score 
Length Length 



Probability 
1.0e-25



Acc# 
S38882 



Probability 



219 | |bbO | |500 | |9.1e-45 



Locus Name 



sp:HIS1_BACSU



Acc# 
O34520






ORF Name 



NT ID AAID 



14509582 c2 45 



TJ5~ 



NT AA 
Length Length 



Score Probability 
] [230 | |3.7e-i9 ~ 



Protein name 



Locus Name 



gp:VCU39068



Acc# 
U39068 



Description 

Vibrio cholerae pathogenicity island, partial and complete cds.



ORF Name 



115776b C2 48 



NTID 



TIT 



AAID 



NT AA 
— — Score 
Length Length 



95" 



■J |29i | |182 | ^ 



Probability 
|4.5e-14 



Protein name 



Description 



Locus Name 



sp:YRPM_ACICA 



Acc# 
P33989 



HYPOTHETICAL 9.2 KD PROTEIN IN RPON-MURA INTERGENIC REGION (ORF3)


ORF Name 


NTID AAID 


NT AA 
— — Score 
Length Length 


Probability 


|16510933_c2_52 


330 2250 


|525 1578 £45 


1.6e-52


Protein name 




Locus Name 


Acc# 



Description 



sp:FUMB_ECOLI



P14407:P78139



FUMARATE HYDRATASE CLASS I, ANAEROBIC, (FUMARASE)




ORF Name NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


j2344b93i_c2_51 331 


2251 


454 


|1365 942 


1.3e-94


Protein name 








Locus Name 


Acc# 


histidinol dehydrogenase




pir:E70368


E70368


Description 












ORF Name NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


23650253_c2_49 | |332 


2252 


1 ^ 1 


1256 |1337| 


[1.8e-136 


Protein name 








Locus Name 


Acc# 



P33986 



Description 

I TRANSFERASE) (KPT) 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


2381950_c3_58 




| 12253 


1 61 


(186 | 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|23867127_ci_41 


IP 4 


| [2254 


| 275 


|828 | |147 | 


5.2e-10


Protein name 








Locus Name 


ACC# 



Description 



sp:YRAP_ECOLI 



P45467 



(O191)


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


2439!>637_ci_40 


IP 5 


| 2255 


| 149 


|450 | 


143 | 


6.2e-10 



Protein name 



Description 



Locus Name 



sp:YRAN_ECOLI



ACC# 
P45465 



HYPOTHETICAL 14.8 KD PROTEIN IN AGAI-MTR INTERGENIC REGION (O131)






ORF Name 


NTID AAID 


NT 
Length 


AA 

— Score 
Length ■ - ■■ 


Probability 


34010260_tl_l 


| |336 J2256 | 


|119 


|360 | |204 


2.1e-16


Protein name 








Locus Name 




ACC# 


general stress protein homolog ykzA




i 


pir:F69870


F69870 


Description 














ORF Name 


NTID AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


5079i88_t3_35 


| |337 |2257 | 


163 | 


|492 | |46I ; 


1.2e- 


1 


Protein name 








Locus Name 




Acc# 


hypothetical 


protein 




1 


gp:ASA224767 


AJ224767 


Description 






Acinetobacter sp. ADP1 ion gene and ORFs.
















ORF Name 



NTID 



AAID 



53300B7 cJ 61 



3T5~ 



22T5" 



NT 
Length 
T7u 1 



AA 
Length 
[1113 | |52? 



Score Probability 
1.7e-92



Protein name 



Description 



Locus Name 



sp:HIS8_ACEXY 



Acc# 
P45358 



PHOSPHATE TRANSAMINASE)



ORF Name 
|9S4837_c233" 



][ 



NTID AAID 

i 



339 



NT 
Length 

^2T9 J |699 



AA 

— , Score 
Length 



Probability 



Protein name 



[2100 | [2198 | |i.ie-227 

Locus Name Acc# 



polyribonucleotide nucleotidyltransferase 



gp:PPY18132



Y18132 



Description 



Pseudomonas putida rpsO and pnp genes.



ORF Name 



NTID AAID 



NT 



S59392 ti 13 



Length 
22"£7T J |7T 



AA 
Length 

1222 — 



Score Probability 



Protein name 
Description 



Locus Name 



Acc# 



NO-HIT



ORF Name 



NTID 



AAID 



1070165 c3 42 



225T" 



NT 
Length 
72 



AA 

Length 



Protein name 
Description 



Locus Name 



Probability 



Acc# 



NO-HIT



ORF Name 



10993750 11 2 



NTID 
|342 



AAID 



][ 



22S2" 



NT 
Length 
|137 



AA 

— , Score 
Length 



ED 



Protein name 
Description 



Locus Name 



Probability 



Acc# 



NO-HIT






ORF Name 



NT ID AAID 



NT AA 

— , — , S core Probability 
Length Length 



20SS4577 c3 43 



Protein name 



probable acyl-CoA dehydrogenase



[2263 | |550 | [1583 | [1389 | |5.7e-142 

Locus Name Acc# 

B75282 



pir:B75282



Description 



ORF Name 



NTID AAID 



NT AA 
— , „ — ^ Score 
Length Length 



124395191 cl 31 



3T3~ 



2264 



Probability 

71 [ |0. Oil 



Protein name 



Locus Name 



conserved hypothetical protein aq_1236



pir:F70406



ACC# 
F70406 



Description 



ORF Name 



33804680 C2 35 



NTID AAID 

I F* 5 "" 



Score 



Probability 



NT AA 
Length Length 

] | 796 | \ 2 ^ 91 | | 709 | |4.4e-72 



Protein name 



Locus Name 



site-specific recombinase



gp:NGU82253 



ACC# 
U82253 



Description 

Neisseria gonorrhoeae site-specific recombinase (gcr) gene, complete cds.



ORF Name 


NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


34085i65J:3JJ0 


| |345 2255 


495 


|1491 


|1327 | 


2.1e-135 


Protein name 






Locus Name


Acc# 








sp:RPSD_PSEAE 


P26480 


Description 












RNA POLYMERASE SIGMA FACTOR RPOD (SIGMA-70)








ORF Name 


NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


35823505_c2_32 


|347 2257 


514 


i 1545 1 


I 1343 1 


4.3e-137 



Protein name 



Locus Name 



Butyryl-CoA: Acetate Coenzyme A transferase



gp:CTACTAGEN



Acc# 
Z69031 



Description 
C. thermosaccharolyticum actA gene. 






ORF Name 



NTID AAID 



— — Score Probability 



135939753 c2 34 



NT AA 
Length Length 
|73 | |222 | |105 | |7.2e-05 



Protein name 



Locus Name 



probable acyl-CoA dehydrogenase 



pir:B75282



Acc# 
B75282 



Description 



ORF Name 



NTID AAID 



NT AA 
— , — , Score 
Length Length 



3917193 c2 33 



] P 49 | p 269 | [ 



35" 



E5D [ 



Protein name 



Locus Name 



Probability 

|2.9e-09 

Acc# 



probable acyl-CoA dehydrogenase



pir:B75282



B75282 



Description 



ORF Name 



NTID AAID 



3954817 Cl 27 



350 



2270 



NT AA 
Length Length 
TF9 1 ITOu — 



Score 



Protein name 



Locus Name 



Probability 
|2.9e-35 

Acc# 



probable acyl-CoA dehydrogenase 



pir:B75282 



B75282 



Description 



ORF Name 



NTID AAID 



NT AA 
— — Score 
Length Length 



5167157 c2 39 



T5T" 



2271 



TFT" 



Probability 
104 | |8.4e-06 



Protein name 



Locus Name 



hypothetical protein PH1801 



pir:A71191



Acc# 
A71191 



Description 



ORF Name 



NTID AAID 



19923125 C2 40 



352 | [2272 | [ 



NT AA 
Length Length 
73 1 \I7T 



Score Probability 



Protein name 



Description 



Locus Name 



ACC# 



NO-HIT






ORF Name 



NTID AAID 



10ibb437 T2 5 



75T 



][ 



tut 



NT AA 
Length Length 

] Em 



Score 



] I 



Probability 
2.0e-11



Protein name 

Description 
(HMP-P KINASE) 



Locus Name 



sp:THID_HAEIN 



Acc# 
P44697 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


23912827_c3_10 




| 2274 


i r 5 i 


240 | 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

m — . , Score 
Length 


Probability 


|35267912_c3_ll 


355 


2275 


306 


|921 | |483 


5.8e 




Protein name 










Locus Name 




Acc# 












sp:PROC_HAEIN 


P43869 


Description 
















PYRROLINE-5-CARBOXYLATE REDUCTASE, (P5CR) (P5C REDUCTASE)






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


|4062840_c2_9 


| 356 


(2276 


| 191 


|575 | 206 | 


1.3e-16


Protein name 










Locus Name 




Acc# 












sp:YGGT_HAEIN


P44097 


Description 
















HYPOTHETICAL PROTEIN HI1036












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|10164063_c2_87 


|357 


2277 


242 i 


729 | 425 | 


8.1e-40


Protein name 










Locus Name 




ACC# 












sp:YAEB_ECOLI 


P28634 



Description 

HYPOTHETICAL 26.4 KD PROTEIN IN PROS-RCSF INTERGENIC REGION (ORF3)






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


14568812_c3_97 




| |2278 


426 | 


|1281 |287 | 


in — c P „-57 — — 1 


Protein name 










Locus Name 


Acc# 


probable lipD protein


pir:G70634


G70634 


Description 














ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability


|14901512_c3_103 


|359 


2279 


156 | 


r 


71 | |210 | 


|4.9e-17 1 
1 _ . 1 


Protein name 










Locus Name 


Acc# 












sp:HIT_BACSU 


O07513


Description 














HIT PROTEIN | 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|164813_r3_52 


| 360 


2280 


| 431 


|1296 |1416 | 


|7.8e-145 | 


Protein name 










Locus Name 


Acc# 



Description 



gp:AB025342 



AB025342 



Moritella marina genes, complete cds, similar to eicosapentaenoic acid
synthesis gene cluster.



ORF Name 



[T7TT 



68763 tl 16 



NTID 



AAID 



2281 



NT 
n 

JTT 



AA 

— Score 
Length Length 



Probability 
7.8e-106



Protein name 



Description 



Locus Name 



sp:HEM2_PSEAE 



Acc# 
Q59643 



SYNTHASE) (ALAD) (ALADH)










ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


23444400_c3_92 


|362 


| |2282 


| 336 


|10ii 


|9.4e-117 


Protein name 








Locus Name 


Acc# 



sp:RUVB_ECOLI



P08577 



Description 
HOLLIDAY JUNC TI ON DNA H E LICAS E RUVB 






ORF Name 



NTID 



AAID 



23525552_c2_83



NT 
n 



AA 

— Score 
Length Length 



Probability 
1259 | |316 | |2.9e-28 — 



Protein name 



Locus Name 



conserved Hypothetical protein yueb' 



pir:G70007



Acc# 
G70007 



Description 



ORF Name 



23595281_f1_17



Protein name 



NTID 



AAID 



NT 
n 
73T 



AA 

— , Score 
Length Length 



] [ 



Probability 
8.4e-235 



Locus Name 



hypothetical protein b2463 



pir:F65021



Acc# 
F65021 



Description 



ORF Name 



NTID 



AAID 



23828428 13 bV 



NT AA 
Length Length 
|272 | |813 | 



Score Probability 
12 50 | |1.6e-40 — 



Protein name 



Locus Name 



aldoketoreductase 



gp:AF001865



ACC# 
AF001865 



Description 



Leishmania mexicana amazonensis aldoketoreductase (PTR-1) gene, complete cds.



ORF Name 



NTID 



AAID 



24250012_c1_66



NT AA 
Length Length 
"j [1731 



[576 



Score Probability 
[X TuT] p.Oe-112 — 



Protein name 



Locus Name 



glycine betaine transporter BetL 



gp:AF102174



Acc# 
AF102174 



Description 



Listeria monocytogenes glycine betaine transporter BetL (betL) gene, complete cds.



ORF Name 



NTID 



AAID 



NT AA 
— , — , Score 
Length Length 



124313512 t2 37 



7ZT 



Protein name 
Description 



Locus Name 



Probability 



Acc# 



NO-HIT






ORF Name 



124317157 n 55 



Protein name 



Description 



NO-HIT 



NTID 



AAID 



— , — , Score Probability 



T5TT 



NT AA 
Length Length 

1 1 175 I ED 



Locus Name 



Acc# 



ORF Name 



12517175 ti 18 



Protein name 
Description 



NO-HIT



NTID 

1 F*~ 



AAID 



NT AA 
Length Length 

□ EH 



Score Probability 



T5" 



Locus Name 



Acc# 



ORF Name 



29376681 tl 1 



Protein name 



Description 



NO-HIT 



NTID 



AAID 



T7TT 



NT AA 
Length Length 



Score Probability 



] i 



^5" 



Locus Name 



Acc# 



ORF Name 



130360452 tl 6 



Protein name 
Description 



NO-HIT



NTID 



AAID 



NT AA 
— — Score 
Length Length 



1371" 



][ 



Locus Name 



Probability 



Acc# 



ORF Name 



30662517 c3 106 



Protein name 



NTID AAID 



TTT 



NT 
n 



— Score 
Length Length 



11440 



Probability 
3.0e-49



Locus Name 



sp:ACRE_ECOLI 



Acc# 
P24180 



Description 

ACRIFLAVIN RESISTANCE PROTEIN E PRECURSOR (ENVC PROTEIN)






ORF Name 


NT ID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|31423292 c2 80 


373 | 2293 


1 1308 1 
1 1 1 


1927 1 1327 1 
1 III 


p.Oe 


-29 | 


Protein name 






Locus Name 




ACC# 


Hypothetical protein Rv0241c


pir:E70938




E70938 


Description 












ORF Name 


NT ID AAID 


NT 
Length 


AA 

t — , i Score 
Length 


Probability 


|31455_f3_54 


I 1374 1 12294 

II II 


1 71 1 
1 1 


1215 1 
1 1 






Protein name 






Locus Name 




Acc# 


Description 












NO-HIT 












ORF Name 


NT ID AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


34110435 t2 30 


| 375 | |2295 


1115 | 
1 1 


1351 1 183 1 
1 III 


|0.030 | 


Protein name 






Locus Name 




ACC# 


microfilarial sheath protein SHP3




|gp:LSU54555 




U54556 


Description 












Litomosoides sigmodontis microfilarial sheath protein SHP3a (shp3a) and microfilarial sheath protein SHP3 (shp3) genes, complete cds.








ORF Name 


NT ID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


4147i93_t2_29 


j 376 | |2296 


| 


1908 | 1551 | 


3.4e 


-242 | 


Protein name 






Locus Name 




Acc# 


dihydroxy-acid dehydratase,


pir:DWECDA






Description 










A27310:D26570:S48894:S30669:F6


ORF Name: 4350088_c3_95 | NT ID: 377 | AA ID: 2297 | AA Length: 458 | NT Length: 1377 | Score: 853 | Probability: 8.1e-86
Protein name: | Locus Name: gp:MLCB1883 | Acc#: AL022486
Description: Mycobacterium leprae cosmid B1883.






ORF Name 



NT ID AAID 



4381318_f3_56



T77T 



NT AA 

— — Score 

Length Length 

TSu 1 



] EEZl 



Probability 
9.0e-57 



Protein name 



Locus Name 



sp:CCA_ECOLI 



Acc# 
P06961 



Description 

(TRNA CCA-PYROPHOSPHORYLASE) (CCA-ADDING ENZYME)



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|4712b37_cl_60 


|379 


| |2299 


1 l liV 1 


p 54 i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|47S9050_c2_79 


ipso 


| |2300 


| 99 


300 |117 | 


|3.5e-07 


Protein name 








Locus Name 


Acc# 


nypotnetical protein APEO^yb 




pir :B72732 


| B72732 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


5266540_ii_8 


381 


2301 


219 


|660 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — , , Score 
Length 


Probability 


p500I2_fi_7 


382 


| |2302 


| 313 


|942 | |952 


1.2e-95


Protein name 








Locus Name 


Acc# 


ferredoxin--NADP+ reductase,




pir:A57432 




Description 










A57432:A53967






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


S59725G_cl_62 


383 


2303 


| 78 


p" i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


— Score 
Length 


Probability 


|68i7i91_c2_B9 


|384 


| |2304 


i 5,5 i 


|2925 | [2816 | 


|3.5e-293 | 


Protein name 








Locus Name 


Acc# 



sp:YHIV_ECOLI



P37637 



Description 

HYPOTHETICAL 111.5 KD PROTEIN IN HDED-GADA INTERGENIC REGION



ORF Name: 781302_c3_98 | NT ID: 385 | AA ID: 2305 | AA Length: 185 | NT Length: 558 | Score: | Probability: 4.1e-52
Protein name: | Locus Name: sp:HPRT_ECOLI | Acc#: P36766
Description: HYPOXANTHINE PHOSPHORIBOSYLTRANSFERASE, (HPRT)


ORF Name: 100305_c3_168 | NT ID: 386 | AA ID: 2306 | AA Length: 251 | NT Length: 756 | Score: 528 | Probability: 9.8e-51
Protein name: | Locus Name: sp:YHHW_ECOLI | Acc#: P46852
Description: HYPOTHETICAL 25.3 KD PROTEIN IN GNTR-GGT INTERGENIC REGION (P231)


ORF Name 


NT AA 
NTID AAID Le - th Le - th 


Score 


Probability 


10604658_t2_36 


387 | 2307 | 488 |1467 


1 P M 


| |1.7e-69 



Protein name 



Locus Name 



RdxB 



|gp:RSU67862 



Acc# 
U67862 



Description 



Rhodobacter sphaeroides rdxB and rdxH genes, complete cds, and ccoP and rdxI
genes, partial cds.






ORF Name 



NT ID AAID 



AA 

— , Score 
Length Length 



T51T 



NT 
n 
TT7 



T75~ 



Probability 
1.2e-13 



Protein name 



Locus Name 



hypothetical protein R186.1



pir:T24235



Acc# 
T24235 



Description 



ORF Name 



NT ID 



AAID 



Score Probability 



1272201 cJ 1SU 



Protein name 



NT AA 
Length Length 

[507 | [109 | |8.2e-05 

Locus Name Acc# 



hypothetical protein spac869.06c 



pir:T39117 



] 



T39117 



Description 



ORF Name 



130U00b0 rl 26 



NT ID 



AAID 



NT AA 
Length Length 




Score 



Probability 



Protein name 



[201 | |74 | [0.021 

Locus Name Acc# 



PilT 



p:STAF000001 



EI 



Description 



AF000001:AF013957



Salmonella typhi topoisomerase B (topB), single strand binding protein (ssb),
Ytl2 homolog (ytl2) genes, complete cds; pil operon, complete sequence; Rci
(rci) gene, complete cds.



ORF Name 



NT ID 



AAID 



— — Score Probability 



13723751 c3 176 



NT AA 
Length Length 



Protein name 



12 7 2 | [1357 | |1.4e-i3lj ~ 
Locus Name Acc# 



FixNd



gp:RLFIXND



Z80339 



Description 



R.leguminosarum fixNd and fixOd genes.



ORF Name 



NT ID 



AAID 



NT AA 
— , — ^ Score 
Length Length 



1 140 tl 11 



[ 392 | [2312 | [ 



Protein name 



Description 



Locus Name 



Probability 



Acc# 



NO-HIT






ORF Name NT ID AAID 


NT 
Length 


AA 

t — . i Score 
Length ~ ■ 


Probability 


14550251_cl_125 393 2313 


1 243 
1 


|732 | 385 | 


1.4e 




Protein name 






Locus Name 




ACC# 








sp : YGBP_HAEIN 


O05029


Description 












HYPOTHETICAL PROTEIN HI0672 


ORF Name NTID AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|155251J:2_53 394 | |2314 


i i its 


|477 | |522 | 


|4.3e 


-50 


Protein name 






Locus Name 




Acc# 



sp :RL13_HAEIN 



P44387 



Description 
50S RIBOSOMAL PROTEIN L13



ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|1585945S_r3_74 395 | 2315 


|95 


|29i | |105 | 


|6.6e 


-06 | 


Protein name 






Locus Name 




ACC# 


hypothetical protein PH0639




pir:H71108 


H71108 


Description 












ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|158038ii_£l_i3 | |396 | |23i6 


|216 


I 651 1 I s7 1 


|0.040 


Protein name 






Locus Name 




ACC# 


somatostatin sst2B receptor 




gp:RNSST2B 


X98234 


Description 












R.norvegicus mRNA for somatostatin receptor.








i 


ORF Name NTID AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


16853590_c3_i64 397 | |2317 


1 1 


|702 255 


|7.3e- 


-23 



Protein name 



Locus Name 



sp:YEAZ_ECOLI



Description 

HYPOTHETICAL 25.2 KD PROTEIN IN FADD-PABB INTERGENIC REGION



Acc# 

P76256:O08476:O08477






ORF Name 



NTID AAID 



19563312_c2_127



I2TTF" 



NT AA 
Length Length 



Score 



Probability 
0.038 



Protein name 



Locus Name 



sp:WAB_BACSU 



Acc# 
P37523 



Description 

HYPOTHETICAL 17.0 KD PROTEIN IN SPO0J-GIDB INTERGENIC REGION



ORF Name 



NTID AAID 



19632661 t l i 91 



2TT9" 



NT 
Length 

irn 



AA 

_ — Score 
Length 



Protein name 
Description 



Locus Name 



Probability 



Acc# 



NO-HIT



ORF Name 



NTID AAID 



203577 cl 9b 



Fnnr 



NT 
Length 

751 



AA 
Length 
12256 



Score 



Protein name 



Description 



Locus Name 



sp:CLPB_HAEIN 



Probability 

1.1e-266

Acc# 
P44403 



CLPB PROTEIN 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


21988931_£3J*y 


|401 


2321 


211 | 


|635 | |563 | 


|1.9e-54 | 


Protein name 








Locus Name 


Acc# 



Description 



sp:UCRI_CHRVI



O31214



(RIESKE IRON-SULFUR PROTEIN) (RISP)










ORF Name 


NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


22066661_t2_40 | 


|402 | 2322 


|191 


i" s i 


■154 | 


p.4e-33 | 



Protein name 

Description 
HYPOTHETICAL PROTEIN HI1034 



Locus Name 



sp:YAJQ_HAEIN 



Acc# 
P44096 






ORF Name 



NT ID AAID 



AA 

— Score 
Length Length 



23525307 c2 146 



TUT 



11 



NT 



] EEZI 



Probability 
3.5e-58 



Protein name 



Locus Name 



cytochrome-c oxidase, type cbb3 chain fixO



pir:S77596 



Acc# 
S77596 



Description 

ORF Name 
|23720002_c2_140 

Protein name 
Description 



NTID 



AAID 



NT AA 
— — Score 
Length Length 



] ED 



Locus Name 



Probability 



Acc# 



NO-HIT


ORF Name: 23860681_f2_39 | NT ID: 405 | AA ID: 2325 | AA Length: 455 | NT Length: 1368 | Score: 1917 | Probability: 6.4e-198
Protein name: | Locus Name: sp:ASSY_HAEIN | Acc#: P44315
Description: LIGASE)


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


23864iB0_±I_I8 


405 




2326 


271 


l au ! 


254 


1.1e-21



Protein name 



Locus Name 



CorE 



gp:AF130857



Acc# 
AF130857 



Description 

Salmonella typhimurium cobalt resistance locus, partial sequence.



ORF Name 



123947151 11 19 



Protein name 



NTID 



AAID 



wut 



TTZT 



NT AA 

— — Score 

Length Length 

TUT 



[JUT 



TTT 



Probability 
2.2e-07 



Locus Name 



unknown 



gp:AF147448 



Acc# 
AF147448 



Description 



Pseudomonas aeruginosa strain PAO1 penicillin-binding protein 2 (pbpA),
rod-shape-determining protein (rodA), membrane-bound lytic transglycosylase
(mltB), rare lipoprotein A (rlpA), penicillin-binding protein 5 (dacA), and
lipoate biosynthesis protein B (lipB) genes, complete cds; and unknown gene.



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability- 


24083208J:3_82 


408 


2328 


i 71 t 


pis | 






Protein name 








Locus Name 




ACC# 


Description 














NO-HIT 




NT ID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


2427187b_cl_122 | 


|409 


| |2329 


i !"» i 


(1677 | [1857 | 


U.5e 


-191 j 


Protein name 








Locus Name 




Acc# 










sp:PYRG_HAEIN




P44341 


Description 














CTP SYNTHASE, (UTP--AMMONIA LIGASE) (CTP SYNTHETASE)




i 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — , _ Score 
Length 


Probability 


24337827_tl_15 | 


|410 


1 2330 


KiLJ 


1068 |1038 | 


|8.9e 


-105 


Protein name 








Locus Name 




Acc# 


dihydroorotase,








pir:T10453




T10453 


Description 














ORF Name: 24344138_f3_68 | NT ID: 411 | AA ID: 2331 | AA Length: 70 | NT Length: 213 | Score: | Probability:
Protein name: | Locus Name: | Acc#:
Description: NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


24417875_clJ.24 | 


|412 


| |2332 


1 1 141 i 


p i \ uh i 


4.3e- 


-09 


Protein name 








Locus Name 




Acc# 



sp:YGBQ_HAEIN 



P44035 



Description 
HYPOTHETICAL PROTEIN HI0673 






ORF Name 
|24500m_c3_163 



NT ID AAID 



TTTT 



NT AA 
Length Length 
1 11560 



Score Probability 
11458 I l2.(S>e-i52 



Protein name 

Description 
SIGNAL RECOGNITION PARTICLE PROTEIN 



Locus Name 



sp:SR54_ECOLI 



Acc# 
P07019 



(FIFTY-FOUR HOMOLOG) (P48)



ORF Name 



24648402 tl 22 



NTID AAID 

i 



NT 



AA 



Length Length 



Score Probability 



TUT 



2334 | [1298 | [3897 | |386 | |5.4e-59 



Protein name 



Locus Name 



probable exonuclease, 



pir:T03465



Acc# 
T03465 



Description 



ORF Name 



NTID AAID 



24844562 c3 167 



][ 



NT AA 
— — Score 
Length Length 



Protein name 



Locus Name 



probable pitB protein 



pir:E70731



Probability 

1.5e-150

Acc# 
E70731 



Description 



ORF Name 



NTID AAID 



29880042 t3 83 



NT AA 
Length Length 
WE 1 11458 



Score 



5T7~ 



Probability 
3.1e-61 



Protein name 

Description 
EXONUCLEASE SBCD



Locus Name 



sp:SBCD_ECOLI



Acc# 
P13457 



ORF Name 



NTID AAID 



3166026 ±3 87 



TTTT 



NT AA 
Length Length 
Tl 1 [2uT 



— — Score Probability 



Protein name 
Description 

NO-HIT



Locus Name 



Acc# 






ORF Name 



NT ID AAID 



NT 
Length 



AA 

— Score 
Length 



33204808 ci 101 



350 



1053 I |231 | 



Probability 
|2.8e-18 



Protein name 



Locus Name 



Hypothetical protein rvs /z 



pir:E71694



Acc# 
E71694 



Description 

ORF Name 
|3367535^tr35" 



NTID 



AAID 



NT 
Length 

H2I 



AA 

_ — Score 
Length 



Probability 
|2.3e-138 



Protein name 



Description 



Locus Name 



sp:CYB_CHRVI 



Acc# 
O31215



CYTOCHROME B














ORF Name 




NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


33707182_tl_ 


27 


II 420 


|2340 


252 j 


I 759 I 


364 


2.7e-4<5 | 



Protein name 

Description 
CYTOCHROME C1 PRECURSOR



Locus Name 



sp:CY1_CHRVI



Acc# 
O31216



ORF Name: 33875885_c3_157 | NT ID: 421 | AA ID: 2341 | AA Length: 68 | NT Length: 207 | Score: | Probability:
Protein name: | Locus Name: | Acc#:
Description: NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

x — . i Score 
Length - ■ 


Probability 


|3405458i_ci_119 


| 422 


2342 


87 


P" 1 I 71 1 


J0.026 | 



Protein name 



Locus Name 



cb-type cytochrome c oxidase CcoQ subunit 



gp:AB024290 



Acc# 
AB024290 



Description 



Magnetospirillum magnetotacticum ccoN, ccoO, ccoQ, ccoP gene for cb-type
cytochrome c oxidase, complete cds.






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


34120251 cl 105 


423 


2343 


322 


|969 | |647 | 


|2.4e 


1 


Protein name 










Locus Name 




ACC# 












sp:UBIA_ECOLI 


P26601 


Description 


















POLYPRENYLTRANSFERASE)




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|36379680_c2_127 


424 


| 2344 


60 | 


p. . 






Protein name 










Locus Name 




ACC# 


Description 
















NO-HIT




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


3906686_c3_155 




| 2345 


553 | 


1962 12231 1 


|3 .4e 


-231 


Protein name 










Locus Name 




Acc# 












sp : GIDA_PSEPU 


P25756 


Description 
















GLUCOSE INHIBITED DIVISION PROTEIN A








1 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|3932753_C2_149 


|426 


| 2346 


I I 


12304 j 235 | 


|1.3e 


-16 | 


Protein name 










Locus Name 




Acc# 












sp:REC2_HAEIN 


P44408 


Description 
















RECOMBINATION PROTEIN 2 




ORF Name: 3942318_f2_54 | NT ID: 427 | AA ID: 2347 | AA Length: 131 | NT Length: 396 | Score: 507 | Probability: 1.7e-48
Protein name: | Locus Name: sp:RS9_HAESO | Acc#: P31782
Description: 30S RIBOSOMAL PROTEIN S9






ORF Name 



NTID 



AAID 



AA 

— , Score 
Length Length 



3947193 Tl 55 



2348 



NT 
n 

TTI 



Probability 



Protein name 



Description 



|399 | |311 | | 9.7e-28 ~ 
Locus Name Acc# 



sp:SSPB_HAEIN



P45206 



STRINGENT STARVATION PROTEIN B HOMOLOG


ORF Name 


NT 

NTID AAID . — , 
Length 


AA 

_ — . , Score 
Length 


Probability 


4119075_Cl_103 


| 429 | |2349 | |251 | 


|84S | |464 | 


6.0e-44


Protein name 




Locus Name 


Acc# 



Description 



sp:BACA_ECOLI 



P31054:P39203



(EC 2.7.1.66)














ORF Name 




NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|4334463_c3_ 


172 


430 


| 2350 


| 169 


pu 


|70 | 


p.8e-05 



Protein name 



Locus Name 



unknown 



AF083916 



Acc# 
AF083916 



Description 



Rhizobium etli Fnr-type transcriptional regulator FnrNc (fnrNc) gene,
complete cds; and unknown genes.



ORF Name 



NTID 



AAID 



4798193 c3 178 



NT AA 
Length Length 
TSB 1 11077 



Score 



Probability 
5.6e-48 



Protein name 



Locus Name 



cytochrome-c oxidase, fixP chain: cb-type
cytochrome-c oxidase 32K chain: cytochrome
b410: fixP protein



pir:D47468 



ACC# 
D47468 



Description 



ORF Name 



NTID 



AAID 



AA 

— , Score 
Length Length 



B00017 13 73 



NT 
n 

7¥T 



] EfZI m [ 



Probability 
|4.6e-53 



Protein name 
Description 

RIBONUCLEASE T, (EXORIBONUCLEASE T) (RNASE T)



Locus Name 



sp:RNT_VIBPA 



Acc# 
P46232 






ORF Name: 520003_c1_125 | NT ID: 433 | AA ID: 2353 | AA Length: 67 | NT Length: 204 | Score: | Probability:
Protein name: | Locus Name: | Acc#:
Description: NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


52uT453_c3__181 


1 1434 


1 12354 
1 1 


1 K45 1 
1 1 1 


|1338 | [1467 | 


3.1e-150


Protein name 










Locus Name 




Acc# 












sp:ENO_ECOLI 


P08324 


Description 
















GLYCERATE HYDRO-LYASE)












i 


ORF Name: 5281318_c3_180 | NT ID: 435 | AA ID: 2355 | AA Length: 290 | NT Length: 873 | Score: 1037 | Probability: 1.1e-104
Protein name: 2-dehydro-3-deoxyphosphooctonate aldolase | Locus Name: gp:AF098791 | Acc#: AF098791
Description: Pseudomonas aeruginosa 2-dehydro-3-deoxyphosphooctonate aldolase (kdsA) gene, complete cds.




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|5901067_ci_104 


| 435 


| 2356 


274 1 


|825 | 202 


|3.5e- 


1 


Protein name 










Locus Name 




Acc# 












sp:YHTQ_HAETN 




P44901 


Description 
















HYPOTHETICAL PROTEIN HI0849










i 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — , . Score 
Length 


Probability 


|7054650_ci_ii8 


1 l 4iV 


| |2357 


62 


189 |53 | 


JO.Oib | 


Protein name 










Locus Name 




Acc# 


ORP-D 








gp:EC010KLS 




D11109 



Description 



E. coli gene for 10K-L and 10K-S protein.






ORF Name 



NTID AAID 



957705 cl 113 



NT AA 
Length Length 
TIB 1 11008 



Score Probability 
[407 | |6.5e-!*a ~ 



Protein name 



Locus Name 



putative regulatory protein 



gp:AF087482



Acc# 
AF087482 



Description 



Pseudomonas aeruginosa clcC and ohbH genes, Lys-R type regulatory protein
(clcR), chlorocatechol-1,2-dioxygenase (clcA), chloromuconate cycloisomerase
(clcB), dienelactone hydrolase (clcD), maleylacetate reductase (clcE),
transposase (tnpA), ATP-binding protein (tnpB), putative regulatory protein
(ohbR), o-halobenzoate dioxygenase reductase (ohbA), o-halobenzoate dioxygenase
alpha subunit (ohbB), o-halobenzoate dioxygenase beta subunit (ohbC),



ORF Name 



9960917 i3 90 



NTID 



AAID 



][ 



TJT9" 



NT AA 
Length Length 

-m — 



Score Probability 
13 54 | |2.7e-32 ~ 



Protein name 



Description 



Locus Name 



sp:SSPA_ECOLI



Acc# 
P05838 



STRINGENT STARVATION PROTEIN A 



ORF Name 



10632090 il 17 



NTID 
] |440 



AAID 



NT AA 

— — Score 

Length Length 

] I 506 I F^" 



Probability 
9.8e-99



Protein name 



Description 



Locus Name 



sp:NUON_ECOLI



Acc# 

P33608:P78281



OXIDOREDUCTASE CHAIN 14) (NUO14)



ORF Name 



NTID AAID 



NT AA 
— , — _ Score 
Length Length 



106946b cl Bb 



7T 



1228 



Protein name 
Description 



Locus Name 



Probability 



ACC# 



NO-HIT



ORF Name 



NTID AAID 



NT AA 
— , — , S core 
Length Length 



10734830 cl 



T5T 



] P 3g2 I P | [ 



TFT" 



Protein name 
Description 



Locus Name 



Probability 



Acc# 



NO-HIT






ORF Name 



1385390 t3 54 



Protein name 



Description 



NT ID AAID 



NT AA 
— — Score 
Length Length 



Probability 

2363 | |216 | |65i | p^~| |4.0e-38 ~ 



Locus Name 



sp:NUOJ_ECOLI



Acc# 

P33605:P78236



OXIDOREDUCTASE CHAIN 10) (NUO10)


ORF Name: 13863425_f2_23 | NT ID: 444 | AA ID: 2364 | AA Length: 276 | NT Length: 831 | Score: 480 | Probability: 1.2e-45
Protein name: hypothetical protein RP682 | Locus Name: pir:E71674 | Acc#: E71674
Description:










ORF Name: 14454827_f2_28 | NT ID: 445 | AA ID: 2365 | AA Length: 211 | NT Length: 636 | Score: 561 | Probability: 3.1e-54
Protein name: pyridoxamine 5-phosphate oxidase | Locus Name: pir:B75513 | Acc#: B75513
Description:
ORF Name 



NTID AAID 



NT AA 
— — Score 
Length Length 



Probability 



114475702 ci 90 



Protein name 



[2366 | | 259 | [780 | |5T 1 [0.00081 ~ 

Locus Name Acc# 



ORES 



EE 



D78257 



D78257 



Description 



Enterococcus faecalis plasmid pYI17 genes for BacA, BacB, ORF3, ORF4, ORF5,
ORF6, ORF7, ORF8, ORF9, ORF10, ORF11, partial cds.



ORF Name 



NTID AAID 



14578202 ti 12 



2367 



NT AA 
— — Score 

Length Length 

pr 



T52" 



75T 



Probability 
|1.2e-75 



Protein name 



Locus Name 



sp:NUOI_ECOLI



Description 

OXIDOREDUCTASE CHAIN 9) (NUO9)



Acc# 

P33604:P76488:P78183






ORF Name: 15625443_c1_134 | NT ID: 448 | AA ID: 2368 | AA Length: 61 | NT Length: 186 | Score: | Probability:
Protein name: | Locus Name: | Acc#:
Description: NO-HIT


ORF Name: 175760_f3_46 | NT ID: 449 | AA ID: 2369 | AA Length: 215 | NT Length: 648 | Score: 352 | Probability: 4.4e-32
Protein name: NADH dehydrogenase chain A | Locus Name: gp:AF057063 | Acc#: AF057063
Description: Erwinia carotovora subsp. carotovora aspartate aminotransferase (aat) gene, partial cds; HexA (hexA), NADH dehydrogenase chain A (nuoA), and NADH dehydrogenase chain B (nuoB) genes, complete cds; and NADH dehydrogenase chain C (nuoC) gene, partial cds.



ORF Name 



NT ID AAID 



19806577 t2 27 



NT AA 

— — Score 

Length Length 

1 rrrzT 



] i 



[TT77~ 



Protein name 



Description 



Locus Name 



sp:MRSA_HAEIN 



Probability 
1.7e-119

ACC# 
P45164 



MRSA PROTEIN HOMOLOG


ORF Name: 2110657_f1_3 | NT ID: 451 | AA ID: 2371 | AA Length: 328 | NT Length: 987 | Score: 760 | Probability: 2.6e-75
Protein name: | Locus Name: sp:Y926_SYNY3 | Acc#: P72872
Description: HYPOTHETICAL 37.9 KD PROTEIN SLL0926



ORF Name 



NT ID 



AAID 



22402252 t2 25 



W5T 



TT7T 



NT AA 
Length Length 



— , , Score 



] eo 



Protein name 
Description 
NO-HIT



Locus Name 



Probability 



Acc# 






ORF Name 



23683215 F2 38 



Protein name 



Description 



NT ID 



AAID 



TTTT 



NT AA 
Length Length 

] | 579 | I 1740 



Score Probability 
11 583 | | 1.6e-152 



Locus Name 



sp:NUOM_ECOLI 



Acc# 

P31978:P78248



OXIDOREDUCTASE CHAIN 13) (NUO13)


ORF Name NT ID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


24225213_13_50 454 2374 | 


255 | 


|80i | 1183 | 


|3.8e-120 


Protein name 






Locus Name 


Acc# 


TOU2 






gp:AF058589 


AF058689 


Description 










Neisseria meningitidis strain Z2491, genomic sequence.


1 


ORF Name NT ID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|24225502_c3_132 455 |2375 | 


p/o | 


|813 |888 


|7.0e-89 | 


Protein name 






Locus Name 


ACC# 








sp:Y572_HAEIN 


| P44758 


Description 










HYPOTHETICAL PROTEIN HI0572 


ORF Name NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|24391557_li_10 455 |2375 | 


1045 | 


|3141 1555 


8.5e-252 | 


Protein name 






Locus Name 


Acc# 


NADH dehydrogenase (ubiquinone), I chain




pir :A65000 




G : nuoK protein 








1 A65000 :S65 










Description 








638 :S38316 
:S37064 


ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|24542893_ti_15 | 457 2377 | 


519 | 


1850 |1809| 


|1.8e-186 | 


Protein name 






Locus Name 


ACC# 



sp:NUOL_ECOLI



Description 
OXIDOREDUCTASE CHAIN 12) (NUO12)



P33607:P78254






ORF Name 



NT ID 



AAID 



2507285 i2 22 



NT AA 
Length Length 
7TS — 



Score 



|642 | |770 | [ 



Probability 
2.2e-75 



Protein name 



Locus Name 



outer membrane protein B1



AF045251 



Acc# 
AF045251 



Description 

Moraxella catarrhalis outer membrane protein B1 gene, complete cds.



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


25392135_i2_26 


ii* m 


| |2379 


i p i 


I 186 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


25579763_t3_6i 


II 460 


| |2380 


i 291 i 


846 374 


2.3e-39 | 


Protein name 








Locus Name 


Acc# 



Description 



sp:FENR_ECOLI



P28861:P11007



(FLXR) (FLDR) (METHYL VIOLOGEN RESISTANCE PROTEIN A) (DA1)




ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


2S228401_c2_105 | 451 | 2381 


| 155 


|471 | 123 


8.1e-08


Protein name 




Locus Name 


Acc# 


hypothetical protein APE1413 




pir :D72619 


D72619 


Description 








ORF Name NTID AAID 


NT 
Length 


AA 

t — , i Score 
Length 


Probability 


29S88176_tl_l 452 2382 


70 


213 304 


4.8e-25 


Protein name 




Locus Name 


ACC# 


transferrin-binding protein 2 precursor


gp:AF105251


AF105251 


Description 








Moraxella catarrhalis transferrin-binding protein 2 precursor (ompB1) gene,
partial cds.








ORF Name 



NTID AAID 



|3008^9J ti bl 



[31TT 



NT 
n 



AA 

— _ Score 
Length Length 



Probability 
| 13 7 B | |8.3e-141 — 



Protein name 



Locus Name 



sp:NUOF_ECOLI



Description 
OXIDOREDUCTASE CHAIN 6) (NUO6)



ACC# 

P31979:P78239



ORF Name 



30252036 c2 98 



NTID AAID 
■J |454 I 12384 



] i 



NT AA 
Length Length 
S3 1 [TST 



Score Probability 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT



ORF Name 



I312834S2 ti 11 



Protein name 



Description 



NTID AAID 
— 



NT AA 
— , — n Score 
Length Length 



Probability 



T3TT5" 



342 | [1029 | [1125 | |4.2e-114 

Locus Name Acc# 



sp:NUOH_ECOLI



P33603:P78307



OXIDOREDUCTASE CHAIN 8) (NUO8)






i 


ORF Name 


NTID AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|3182057_c3_m 


| 455 |2386 


515 | 


1551 |15>55 j 


|4.1e-203 | 


Protein name 






Locus Name 


Acc# 



Description 



sp:SYR_HAEIN



P43832 



ARGINYL-TRNA SYNTHETASE, (ARGININE--TRNA LIGASE) (ARGRS)


i 


ORF Name NTID 


AAID 


NT AA 
— — , Score 
Length Length 


Probability 


33723387_ti_5 457 


| |2387 | 


235 |708 | |799 | 


|1.9e-75 


Protein name 




Locus Name 


Acc# 



sp:NUOB_ECOLI



Description 



P33598:P78090






ORF Name 



NTID AAID 



33772186 13 41 



^5" 



NT 
n 



AA 

— Score 
Length Length 



Probability 



T75TT 



[1501 | |i.9e-164 



Protein name 



Locus Name 



transferrin binding protein B



gp:AF039313 



Acc# 
AF039313 



Description 



Moraxella catarrhalis strain LES-1 transferrin binding protein B (tbpB) gene,
complete cds.


ORF Name: 34176950_f3_42 | NT ID: 469 | AA ID: 2389 | AA Length: 548 | NT Length: 1647 | Score: 331 | Probability: 2.2e-29
Protein name: | Locus Name: sp:Y170_METJA | Acc#: Q57634
Description: HYPOTHETICAL PROTEIN MJ0170


ORF Name: 34414552_f3_47 | NT ID: 470 | AA ID: 2390 | AA Length: 584 | NT Length: 1755 | Score: 2190 | Probability: 7.5e-227
Protein name: NADH dehydrogenase (ubiquinone), I, C-D chain | Locus Name: pir:D65000 | Acc#: D65000:S38313:S38312:S65634:S6
Description:


ORF Name: 35166075_f1_4 | NT ID: 471 | AA ID: 2391 | AA Length: 294 | NT Length: 885 | Score: 315 | Probability: 3.7e-28
Protein name: periplasmic chaperone protein | Locus Name: gp:AF095845 | Acc#: AF095845
Description: Pseudomonas syringae cell division/stress response protein (ftsK) and periplasmic chaperone protein (lolA) genes, complete cds.



ORF Name 



36144687 13 49 



NTID 



AAID 



NT AA 
— — Score 
Length Length 



] F^H EI 



Probability 
35TT~[ |i.2e-20 ~ 



Protein name 

Description 
OXIDOREDUCTASE CHAIN 4) (NUO4) (FRAGMENT)



Locus Name 



sp:NUOD_SALTY



Acc# 
P33902 






ORF Name 



NTID AAID 



NT 
Length 



AA 

r — ^ Score 
Length 



[2393 | 



Probability 
1.7e-14



Protein name 



Locus Name 



gp:ECPMC7A 



Description 

E.coli Plasmid pMccC7 mccA, B, C, D, E, F genes.



Acc# 
X57583 



ORF Name 
|4740902_c2JL27 



NTID 



AAID 



] 



NT 
Length 

\m — 



AA 

T — ^ Score 
Length 



] CO EH] [ 



Probability 
0.00043 



Protein name 



Description 



Locus Name 



sp:PRXH_BPMD2 



Acc# 
O64252



PUTATIVE NON-HEME HALOPEROXIDASE,


ORF Name NTID AAID 


NT 
Length 


— Score 
Length 


Probability 


|4796875_ti_5 475 2395 


78 


|237 | 144 


4.8e-10 | 


Protein name 




Locus Name 


Acc# 


conserved hypothetical protein




pir:H75273 


| H75273 


Description 








ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|5097885J:1JL4 475 | 2395 




|438 | |320 | 


|1.1e-28


Protein name 




Locus Name 


Acc# 



Description 



sp:NUOK_ECOLI



P33606:P76487:P78182



OXIDOREDUCTASE CHAIN 11) (NUO11)








ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — , , Score 
Length 


Probability 


|7226452_±1_9 


477 


| 2397 


i r 4 


|525 | |470 | 


|1.4e-44 


Protein name 








Locus Name 


ACC# 



sp:NUOE_SALTY 



P33903 



Description 
OXIDOREDUCTASE CHAIN 5) (NUO5)






ORF Name 



NT ID AAID 



NT AA 
— — Score 
Length Length 



Probability 



10181576_t2_42 


478 2398


101 








Protein name 






Locus Name 




Acc# 


Description 












NO-HIT


ORF Name 


NT ID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|10751312_t 1_7 


| |479 | |2399 


1 1939 1 
1 1 1 


12820 1 1710 1 
1 III 


|2.9e 


-114 


Protein name 






Locus Name 




Acc# 








|sp:YCBY_HAEIN 




P44524:P43 
945 


Description 










HYPOTHETICAL PROTEIN HI0116/115








i 


ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|I0975302_ci_93 


| |480 | |2400 


1 P*» 1 


|882 | |185 


2.5e 


i 


Protein name 






Locus Name 




ACC# 


probable D,D-carboxypeptidase




pir:B71353




B71353 


Description 












ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


i95S7762_cij;7 


| |481 2401


i r i 


I 270 1 






Protein name 






Locus Name 




Acc# 


Description 












NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


AA 

t — i i Score 
Length 


Probability 


|19735877_t2_!}4 


|482 | |2402 


i w 


192 







Protein name 



Locus Name 



Acc# 



Description 



NO-HIT






ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


21491075_c3_jL27 


483 2403 


517 


1554 309 | 


1.4e-41


Protein name 








Locus Name 


ACC# 


CjaB protein 


gp:CJE17971


Y17971 


Description 




Campylobacter jejuni cjaB gene. 


ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|21520276_c3_136 


| |484 | |2404 


i r ih i 


|828 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability


21603403_C3_126 


485 2405 


1 \ bU 


1632 |857 | 


|1.3e-85 | 


Protein name 








Locus Name 


Acc# 










sp:YMDC_ECOLI


P75919 


Description 












HYPOTHETICAL 55.9 


KD PROTEIN IN CSGC-MDOG INTERGENIC REGION 




ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|21679677_r3_58 


486 | |2406 


| 476 


|1431 | |1649| 


|1.6e-169


Protein name 








Locus Name 


Acc# 








sp:GLNA_AZOVI


P22248 


Description 












GLUTAMINE SYNTHETASE, (GLUTAMATE--AMMONIA LIGASE)




ORF Name 


NTID AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


22306532_c3_134 


| |487 | |2407 


1 P 55 1 


1768 1 426 1 


|6.3e-40 | 


Protein name 








Locus Name 


Acc# 








sp:LPSA_PASHA


Q05770 



Description 



LPSA PROTEIN 






ORF Name 


NTID AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


22442010 tl l 


488 2408 


354 


1055 |450 | 


H..8e 


-42 


Protein name 








Locus Name 




Acc# 


unknown






gp:AF116284




AF116284 


Description 














Pseudomonas aeruginosa DnaJ-like protein gene, complete cds; and unknown genes.














ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


2375337_t3_49 


|489 | 2409 | 


50 


|183 | 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


23944431_c2_116 


490 2410 


80 


243 | |106 | 


|5.1e 


-06 


Protein name 








Locus Name 




Acc# 


hypothetical protein APE0029


pir:H72754 




H72754 


Description 














ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


23945931_t3_5B 


| 491 | 2411 | 


Ml | 


|1041 | |135 


1.2e-06


Protein name 








Locus Name 




ACC# 


hypothetical protein slr1166




pir:S75877




S75877 


Description 














ORF Name 


NTID AAID 


NT 
Length 


AA 

t — , i Score 
Length 


Probability 


2395451i_ti_6 


| 492 2412 


811 


|2436 |2745 |


|1.2e 


-285 


Protein name 








Locus Name 




Acc# 



Description 
(PEP SYNTHASE) 



sp:PPSA_ECOLI



P23538 






ORF Name 



NTID AAID 



AA 

— Score 
Length Length 



123989752 cl 84 



TZTT 



NT 

ill 



501 



Probability 
|288 | |1.0e-42



Protein name 
Description 

DEHYDRATASE) 



Locus Name 



sp:3DHQ_NEUCR



Acc# 
P05195 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|24306512_c2_99 


| |494 


2414 


| 202 


|609 | |509 | 


|1.0e-48 |


Protein name 








Locus Name 


Acc# 










sp:GCH1_OSTOS


O61573


Description 












GTP CYCLOHYDROLASE I, (GTP-CH-I)








ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|24337752_i2_32 


| 495


|2415 


1 I 378 


[1137 | |884 


1.9e-88 |


Protein name 








Locus Name 


Acc# 



sp:YDAO_ECOLI



Description 

HYPOTHETICAL 35.6 KD PROTEIN IN DBPA-INTR INTERGENIC REGION



P76055:Q47558



ORF Name 



124646887 tl 16 



NTID 



AAID 



|2416



NT AA 
Length Length 

pio 



Score Probability 



TF9 - 



Protein name 
Description 

NO-HIT



Locus Name 



Acc# 



ORF Name 



NTID 



AAID 



|2488i717_t2_39 | |497 | [2417 | 



NT AA 
Length Length 

TU7 



— , — , Score Probability 



324 



Protein name 
Description 

NO-HIT



Locus Name 



Acc# 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


|2559S262_fc3_68 


| 498


2418 


168 


507 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|25354750_t3_50 


| |499 


| |2419


i r i 


p3 i 




Protein name 








Locus Name 


Acc# 


Description 












|NO-HIT | 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|29332503_i:3_<;6 


II 500 


| |2420


i p 01 i 


|906 | |797 | 


|3.1e-79



Protein name 



Locus Name 



enoyl-(acyl-carrier protein) reductase



gp:AF104262 



Acc# 
AF104262 



Description 



Pseudomonas aeruginosa enoyl-(acyl-carrier protein) reductase (fabI) gene,
complete cds.


NT AA 

ORF Name NT ID AAID , — . , „ — 

Length Length 


Score Probability 


|29335786_t3_46 | |501 | |2421 | |249 | |750


|428 | 3.9e-40 | 



Protein name 



Locus Name 



unknown



gp:AF116284



ACC# 
AF116284 



Description 



Pseudomonas aeruginosa DnaJ-like protein gene, complete cds; and unknown genes.






ORF Name 


NT AA 
NTID AAID . — . , . — . , Score 
Length Length 


Probability 


29382075_tl_4 


502 2422 312 939 429 


3.0e-40 



Protein name 



Locus Name 



probable membrane protein b1520



pir:C64906



Acc# 
C64906 



Description 






ORF Name 



31425825 tl 22 



NT ID 



AAID 



NT AA 
Length Length 

TZQ 1 f^n — 



Score Probability 
|77i | |1.7e-7fi 



Protein name 

Description 
EPIMERASE) (PPE) (R5P3E)



Locus Name 



sp:RPE_HAEIN



Acc# 
P44756 



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 




32177_c3_133 


|504 


| |2424 


1 1 64 


i 195 i 






Protein name 








LOCUS 


Name 


Acc# 


Description 














|NO-HIT 














ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

Length 


Score 


Probability 


3316436_rl_19 


ii 505 


| 2425 


462 | 


|1389 


331 


7.4e-30 | 


Protein name 








Locus 


Name 


Acc# 










sp:VISC_ECOLI 


P25535 


Description 














VISC PROTEIN, 




ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|33632828_t3_62 


IP* 


| |2426


IH 4S 1 


|V,0 | 


i tb9 1 


|5.1e-54 |



Protein name 



Locus Name 



ribose-5-phosphate isomerase



gp:AF037440



ACC# 
AF037440 



Description 



Edwardsiella ictaluri D-3-phosphoglycerate dehydrogenase (serA) gene, partial
cds; ribose-5-phosphate isomerase (rpiA), inhibitor of chromosome initiation
(iciA), putative 26 kDa protein (yggE), putative 30.6 kDa protein (yggB), and
fructose 1,6-bisphosphate aldolase (fda) genes, complete cds; and
phosphoglycerate kinase (pgk) gene, partial cds.



ORF Name 


NT ID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|338S3431_t3_53 


J 507 |2427 


430 | 


1293 


456 I 


4.2e-43 



Protein name 



Locus Name 



conserved hypothetical protein



pir:F75546



Acc# 
F75546 



Description 






ORF Name 



NTID AAID 



35153902 c2 109 



2428 



NT 
ST7 



AA 

— — Score 
Length Length 



Probability 
|8.9e-103



Protein name 

Description 
PROBABLE TRANSPORT ATP-BINDING PROTEIN MSBA



Locus Name 



sp:MSBA_ECOLI



Acc# 
P27299 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|35350061_c2_98 


ii 509 


|2429 


64 | 


i 195 i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


36128378__i3J>7 




| (2430 


1 1 124 i 


|375 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


3912568_cl_92 


511 


2431 


" b 1 


1578 |1470 | 


|l.be-150 



Protein name 



Locus Name 



soluble pyridine nucleotide transhydrogenase | gp:AF159108



Acc# 
AF159108 



Description 



Azotobacter vinelandii soluble pyridine nucleotide transhydrogenase (sth)
gene, complete cds. 



ORF Name 



4111008 r2 33 



NTID 



AAID 



PTT2" 



NT AA 
— — Score 

Length Length 



Probability 
| | 3.2e-20 ~ 



Protein name 



Locus Name 



sp:CSPA_PSEAE



Acc# 
P95459 



Description 
MAJOR COLD SHOCK PROTEIN CSPA 






ORF Name 



14500892 ci 9i 



Protein name 



Description 



NTID 



AAID 



NT AA 
Length Length 
251 1 \T7Z — 



Score 



Probability 
Oe^ 



Locus Name 



sp:YDIA_ECOLI



ACC# 

P03822:P46137:P76203



HYPOTHETICAL 31.2 KD PROTEIN IN PPSA-AROH INTERGENIC REGION




ORF Name NTID AAID 


NT AA 
— — Score 
Length Length 


Probability 


5132667_i:i_12 | |514 |2434 


368 |1107 |124 |


|6.4e-05 | 


Protein name 


Locus Name 


Acc# 


mannosyltransferase-like protein


gp:YPS251712 


AJ251712 



Description 



Yersinia pseudotuberculosis serotype O:1b hemH gene (partial) and O-antigen
gene cluster for ddhD gene, ddhA gene, ddhB gene, ddhC gene, prt gene, wbyH
gene, wzx gene, wbyI gene, wbyJ gene, wzy gene, wbyK gene, gmd gene, fcl gene,
manC gene, wbyL gene, manB gene and wzz gene.






ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


5859762_c2_120 


515 2435


115 


i 348 i 




Protein name 






Locus Name 


ACC# 


Description 










NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


589450I_ci_82 | 


516 | |2436 


551 


|1656 564 | 


l.be-54 | 


Protein name 






Locus Name 


Acc# 








sp:Y653_HAEIN 


P44029 


Description 










HYPOTHETICAL PROTEIN HI0653 


ORF Name 


NTID AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


5993768_c2_122 


517 |2437 


514 


1542 305 


4.6e-24 | 



Protein name 



Locus Name 



sp:OSTA_HAEIN 



Acc# 
P44846 



Description 

ORGANIC SOLVENT TOLERANCE PROTEIN HOMOLOG PRECURSOR






ORF Name 



NTID AAID 



NT AA 
— — Score 
Length Length 



|7ib00bl c3 m 



2438 



7JT 



7TT 



TTT 



Probability 
1.2e-08



Protein name 



Locus Name 



hypothetical protein APE2143 



pir:B72521



Acc# 
B72521 



Description 



ORF Name 



197705b C3 129 



NTID AAID 

|519 | |2439



NT AA 
— — Score 
Length Length 



Protein name 



Description 



Locus Name 



sp:YBJE_ECOLI



Probability 
|2.7e-48 

ACC# 
P75826 



HYPOTHETICAL 34.4 KD PROTEIN IN POXB-AQPZ INTERGENIC REGION




ORF Name 


NT AA 

NTID AAID . . , . . , Score 

Length Length 


Probability 


9882793_c3_i28 


| |520 | |2440 | |55 |198 |109 | 


p.5e-06 | 



Protein name 



Locus Name 



hypothetical protein APE0666 



pir:F72654



Acc# 
F72654 



Description 
ORF Name 



NTID AAID 



AA 

— Score 
Length Length 



23615951 ci 12 



NT 
n 

TIE 



TST 



Probability 
4.5e-14 



Protein name 



Locus Name 



adhesin complex 25K protein precursor : LecA
protein 



pir:JC5327



Description 

ORF Name 
|24259425_c3_15 
Protein name 



Acc# 

JC5327:PC4312



NTID AAID 



NT AA 
Length Length 

]E!!= 



Score 



194 



Probability 
2.4e-15



Locus Name 



adhesin complex 25K protein precursor : LecA
protein 



pir:JC5327



Description 



Acc# 

JC5327:PC4312






ORF Name 



I33986343 El 10 



NTID 



AAID 



2443 



NT AA 
— — Score 

Length Length 

1 12100 



Probability 
2.9e-2S0 



Protein name 



Locus Name 



oligopeptidepermease 



gp:SPOPPDACA



Acc# 
X89237 



Description 

S. pyogenes DNA for oppA, oppB, oppC, oppD, oppF, and dacA genes.



ORF Name 
[4 7 2 7 338_f T^~ 



NTID 



AAID 



NT AA 
Length Length 

3T5 1 nmr 



Score 



Probability 
1.0e-128 



Protein name 



Locus Name 



oligopeptidepermease 



gp:SPOPPDACA



Acc# 
X89237 



Description 



S. pyogenes DNA for oppA, oppB, oppC, oppD, oppF, and dacA genes.


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|4788508_±3_ii 


|525 






| 192 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|60532I2_t3_8 


II 526 


IP 44 * 1 


|340 


| |1023 | J1398 | 


|6.3e-143


Protein name 








Locus Name 


Acc# 



gp:SPOPPDACA



Description 



S. pyogenes DNA for oppA, oppB, oppC, oppD, oppF, and dacA genes.


ORF Name 




NTID 


AAID 


NT 
Length 


AA 
Length 


Score Probability 


12265658_C2_ 


101 


[527 


2447 | 


221 


| 


753 |1.4e-74 |



Protein name 

Description 
DNA POLYMERASE III SUBUNIT GAMMA/TAU,



Locus Name 



sp:DP3X_HAEIN



ACC# 
P43746 






ORF Name 



NTID AAID 



12501562 c2 109 



2448 



NT AA 
Length Length 
[214 



Score 



ST5~ 



Probability 
|1.5e-21



Protein name 



Locus Name 



hemolysin-related protein



pir:F72326



Acc# 
F72326 



Description 

ORF Name 
112605253 cl 85 



NTID 



AAID 



NT 



— Score 
Length Length 



■j [2904 | 



Probability 
|1.3e-g5 ~ 



Protein name 



Locus Name 



Acc#



sp:MLTD_ECOLI 



Description 




P23931:P32982:P77350


(MUREIN HYDROLASE D) (REGULATORY 


PROTEIN DNIR) 




ORF Name NTID AAID 


NT AA 
— — Score 
Length Length 


Probability 


1289I082_t3_51 | |530 | |2450 


| |237 | |714 | |234 | 


1.4e-19 | 


Protein name 


Locus Name 


ACC# 



sp:YBHD_ECOLI 



Description 

HYPOTHETICAL TRANSCRIPTIONAL REGULATOR IN MODC-BIOA INTERGENIC REGION



P52696:P75761



ORF Name 



NTID 



AAID 



13876010 tl II 



BUT" 



NT AA 
Length Length 

] [ ^~ 



\£UJT 



Score Probability 
] ED 



T.0e-il 



Protein name 



Description 



Locus Name 



sp:RBCR_CHRVI 



ACC# 
P25544 



RUBISCO OPERON 


TRANSCRIPTIONAL REGULATOR 






ORF Name 


NT 

NTID AAID . . , 

Length 


AA 

. — . , Score 
Length 


Probability 


15870706_Cl_68 


| |532 2452 344 


1035 |1009| 


|1.1e-101


Protein name 




Locus Name 


Acc# 



sp:LEU2_ECOLI



Description 

(ISOPROPYLMALATE ISOMERASE) (ALPHA-IPM ISOMERASE) (IPMI)



P30127:P78042






ORF Name 



NT ID 



AAID 



175062 cl 79 



NT 
Length 
TT9 



AA 

_ — _ Score 
Length 



1740 



Probability 
3.4e-73



Protein name 



Description 



Locus Name 



sp:HPPD_PSESP 



Acc# 
P80064 



4-HYDROXYPHENYLPYRUVATE DIOXYGENASE, (4HPPD) (HPD)






ORF Name 


NT ID AAID 


NT 
Length 


AA 

— , „ Score 
Length 


Probability 


197S9052_ciJ74 


| 534 2454 


131 


|396 | |294 |


2.0e 




Protein name 






Locus Name 




Acc# 








sp:SYK_ACICA 




Q43990 


Description 












LYSYL-TRNA SYNTHETASE, (LYSINE--TRNA LIGASE) (LYSRS)






ORF Name 


NTID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|20178438_ci_80 


| |535 | 2455 | 


ttj 


522 529 


|1.9e 


-61 


Protein name 






Locus Name 




Acc# 








sp:HPPD_PSESP 




P80064 


Description 












4-HYDROXYPHENYLPYRUVATE DIOXYGENASE, (4HPPD) (HPD)






ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


21729513_c3_129 


|536 | 2456 |


61 i 


p« | 






Protein name 






Locus Name 




ACC# 


Description 












NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


21738306_r2_30 


|537 2457 


100 | 


|303 | |185 


4.9e- 


-14 | 


Protein name 






Locus Name 




ACC# 








sp:SECF_HAEIN




P44590 



Description 
PROTEIN-EXPORT MEMBRANE PROTEIN SECF






ORF Name 
|224437b0 ci TZZ 



NTID AAID 



I53B" 



NT AA 
Length Length 
[2T7T 1 — 



Score Probability 
] |166 | |8.1e-12 ~ 



Protein name 

Description 
HYPOTHETICAL itj.i KB PROTEIN *3LL12b4 



Locus Name 



sp:YC54_SYNY3



Acc# 
P74078 



ORF Name 



123572128 cl 92 



Protein name 



NT ID 



AAID 



NT 



AA 



— — , Score Probability 
Length Length 



TUT 



|312 | |179 | |6.0e-ia 

Locus Name Acc# 



sp:RADA_PSEAE



P96963 



Description 

DNA REPAIR PROTEIN RADA HOMOLOG (DNA REPAIR PROTEIN SMS HOMOLOG)



ORF Name NTID 


AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


23514376_c3_li9 540 


2460


|312 


|939 | 742 | 


2.1e-73 | 


Protein name 






Locus Name 


Acc# 








sp:EX3_HAEIN 


P44318 


Description 










EXODEOXYRIBONUCLEASE III, (EXONUCLEASE III) (EXO III)




ORF Name NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


23994182J:i_i7 | 541 


|2461


i r n 


1^ 1 I 175 1 


2.5e-13 | 


Protein name 






Locus Name 


ACC# 


orf1






gp:PAU39558


U39558 


Description 


Pseudomonas aeruginosa orf1, TolQ (tolQ), TolR (tolR), TolA (tolA), and TolB
(tolB) genes, complete cds.


ORF Name NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


24276625_c3_122 | 542 


| |2462 


i p» 


|882 | |337 


|1.7e-30 |



Protein name 



Locus Name 



sp:YGIP_ECOLI



ACC# 
P45463 



Description 

HYPOTHETICAL TRANSCRIPTIONAL REGULATOR IN BACA-TTDA INTERGENIC REGION






ORF Name 



NT ID 



AAID 



24406575 ci 69 



NT 
n 

[227 



AA 

— Score 
Length Length 



Probability 
|1.9e-77 



Protein name 



Description 



Locus Name 



sp:LEUD_AZOVI



Acc# 
P96196 



(ISOPROPYLMALATE ISOMERASE) (ALPHA-IPM ISOMERASE)


NT 

ORF Name NT ID AAID — , 

Length 


AA 

. — . , Score 
Length 


Probability 


|24415911_c3J.15 | |544 | |2464 | 97 | 


|294 | |97 | 


|4.6e-05 | 


Protein name 


Locus Name 


Acc# 


outer membrane protein H.8 precursor 


pir:S04157


S04157 


Description 






ORF Name NT ID AAID _ — , 

Length 


AA 

_ — . . Score 
Length 


Probability 


24417077_c3_12i 545 | |2465 555 


1668 229 


9.9e-16 | 



Protein name 



Description 



Locus Name 



sp:DP3X_HAEIN



ACC# 
P43746 



DNA POLYMERASE III SUBUNIT GAMMA/TAU,






ORF Name 


NT ID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


2558450i_c2_110 


| 546 | |2466 


230 | 


I 693 1 




Protein name 






Locus Name 


ACC# 


Description 










NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|30i98405_c2_i00 


| 547 | |2467 


1 417 1 


1254 | |1913 | 


I.Ve-197 


Protein name 






Locus Name 


Acc# 



sp:SYK_ACICA 



Q43990 



Description 

LYSYL-TRNA SYNTHETASE, (LYSINE--TRNA LIGASE) (LYSRS)






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

- — . , Score 
Length 


Probability 


34406268_cl_70 


548 




169 | 


i 510 i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|3912568_c2_105 


|549 


| |2469 


i r i 


|1404 |S19 | 


1.4e-81 


Protein name 








Locus Name 


Acc# 



sp:NHAC_BACFI 



P27611 



Description 
NA(+)/H(+) ANTIPORTER



ORF Name 
p95!n77_c3_11 7 



NT ID 



AAID 



NT AA 

— — Score 

Length Length 

1 



Protein name 



Description 



F 7 I EZD 



Locus Name 



Probability 
3.1e-51 



sp:LEU2_CANMA 



Acc# 
Q00464 



ISOMERASE) (ALPHA-IPM ISOMERASE) (IPMI)








ORF Name 


NT ID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


p9888i:4_rl_4 


| |551 | |2471 


| 625 


|1887 | 


|1391 | 


p.be-142 | 



Protein name 



Locus Name 



general protein secretion pathway subunit 
SecD 



gp:AF179925



ACC# 
AF179925 



Description 



Citrobacter freundii general protein secretion pathway subunit secD gene,
complete cds.


ORF Name 


NT AA 
NT ID AAID — — Score 
Length Length


Probability 


4314068_c3_li8 


552 |2472 | 359 |1080 | |1356 |


1.8e-138 


Protein name 
Description 


Locus Name 
sp:LEU3_NEILA 


Acc# 
P50180 






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|4335328_c2_98 


ii 553 


| |2473 


ip i 


|189 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length — - 


Probability 


448763B_ti_I 


||554


| |2474 


i r j i 


|1842 | |1885 |


|1.2e-194 |


Protein name 








Locus Name 


Acc# 










|sp:PPCK_CHLLI 


Q08262 


Description 












(PHOSPHOENOLPYRUVATE CARBOXYLASE)


(PEPCK) 




1 
1 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability


4771925_c2_94 


| 555 


| |2475 


| 207 


P 4 | pjl | 


|7.4e-30 J 


Protein name 








Locus Name 


Acc# 










sp:RUVA_PSEAE


Q51425 


Description 












HOLLIDAY JUNCTION 


DNA HELICASE RUVA 






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|48S5427_t3_6I 


| 556 


| 2476 


1 P yy 1 


|870 | |247 


S.9e-21 | 


Protein name 








Locus Name 


Acc# 


hypothetical protein






pir:S75235


S75235 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|488I338_ti_5 


II 557 


| |2477 


i p uv i 


|864 | |420 | 


|2.7e-39 | 



Protein name 

Description 
PROTEIN-EXPORT MEMBRANE PROTEIN SECF



Locus Name 



sp:SECF_HAEIN



Acc# 
P44590 






ORF Name 



NTID 



AAID 



509446J ±2 28 



NT AA 
Length Length 
ITT 



] [ 



Score Probability 
] |240 | |j.2e-20 



Protein name 



Locus Name 



sp:YAJC_ECOLI 



Acc# 
P19677 



Description 

HYPOTHETICAL 11.9 KD PROTEIN IN TGT-SECD INTERGENIC REGION (ORF12)



ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|5Iii588_c3_116 | |559 | |2479 


i p« 


1047 | |14SS | 


|3.9e-150 | 


Protein name 




Locus Name 


Acc# 


fructose-1,6-bisphosphate aldolase




gp:PST011927


AJ011927 


Description 


Pseudomonas stutzeri fda gene and gene encoding hypothetical protein.


ORF Name NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


978400_ci_83 560 | |2480 


| 387 


|11S4 pi - 1 


|4.4e-48 | 



Protein name 



Locus Name 



penicillin-binding protein 4 



gp:AF156692



Acc# 
AF156692 



Description 

Neisseria gonorrhoeae penicillin-binding protein 4 (pbp4) gene, complete cds.



ORF Name 
|i053753_t3_63 



NTID 



AAID 



NT AA 
— — Score 
Length Length 



Probability 



Protein name 



| 561 | |2481 | |588 | |1767 | |801 | |1.2e-79

Locus Name Acc# 



putative membrane protein 



gp:AF150928



AF150928 



Description 



Acinetobacter sp. ADP1 BenP (benP) and AreR (areR) genes, complete cds; are
operon, complete sequence; SalD (salD), and SalE (salE) genes, complete cds;
SalR (salR), SalA (salA), putative membrane protein, putative 2-component
regulatory protein, putative histidine kinase of 2-component regulatory
system, and carbonic anhydrase homolog genes, complete cds; and
dihydropyrimidinase homolog gene, partial cds.



ORF Name 



NTID 



AAID 



11058425 c2 108 



NT AA 
— — Score 

Length Length 



Probability 
|1.2e-27



Protein name 



Locus Name 



ribosomal protein S18 
Description 



pir:E64076



Acc# 
E64076 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


|TT03450 fl 5 


563 


2483 


530 


1593 |1297 | 


|3.2e-132


Protein name 










Locus Name 




Acc# 












sp:YB2X_HAEIN


O86233


Description 
















HYPOTHETICAL 


PROTEIN HI1126.1 














ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Probability 


|12271925 cl 71 


| 564 


| |2484 


1 1143 
1 1 


|432 | |197 | 


pr.2e 


-15 | 


Protein name 










Locus Name 




Acc# 












sp:YFFB_HAEIN 




P44515 


Description 
















HYPOTHETICAL 


PROTEIN HI0103












ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


(15535930 f3 61 


|565 | 2485 


1373 1 
1 1 


[1122 | 842 | 


|5.2e 


-84 


Protein name 










Locus Name 




ACC# 












sp:QUEA_ECOLI 




P21516 


Description 
















(QUEUOSINE BIOSYNTHESIS PROTEIN QUEA)










ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— „ Score 
Length 


Probability 


|16134657J:2_a4 


| 566 


| |2486 


1 1 


|1104 | |895 | 


|1.3e 


-89 


Protein name 










Locus Name 




Acc# 












sp:GCST_ECOLI 




P27248 


Description 
















| PROTEIN) 




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


|197211_cl_9i 


| 567 


| |2487 


240 


723 |304 | 


|5.4e 


-27 


Protein name 










Locus Name 




ACC# 


hypothetical protein 






gp:ACRBDOXN




Z46863 



Description 



Acinetobacter sp. cysD, cobQ, sodM, lysS, rubA, rubB, estB, oxyR, ppk, mtgA,
ORF2 and ORF3 genes.






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|20890660_t3_56 


568 


2488 


121 1 


P" 1 






Protein name 










Locus Name 




ACC# 


Description 
















NO-HIT




NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|20915682_rl_l 




| |2489 


i i 


|429 | 162 


[6 . oe 


-12 


Protein name 










Locus Name 




ACC# 












sp:YIBN_ECOLI 


P37688 


Description 
















HYPOTHETICAL 15.6 KD PROTEIN IN SECB-TDH INTERGENIC REGION






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


21759556_tl_8 


|570 


| 2490 


i r i 


1776 | |630 | 


|1.6e 


-103 


Protein name 










Locus Name 




Acc# 


Na(+):solute symporter (Ssf family)


pir:E70480


E70480 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


22353380_rl_7 


i r 1 


1 2491 


89 


|270






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — , , Score 
Length 


Probability 


|23438426_t2_26 


||572 


| |2492 


1 1 


|2886 |2873 | 


p.le 


-299 


Protein name 










Locus Name 




Acc# 












sp:GCSP_ECOLI 




P33195 



Description 

DECARBOXYLASE) (GLYCINE CLEAVAGE SYSTEM P-PROTEIN)






ORF Name 



NTID 



AAID 



NT AA 
— — Score 
Length Length 



23444531 r2 25 



57T 



2493 



Probability 
2 . 5e-3B 



Protein name 



Locus Name 



glycine cleavage system protein 
H : aminomethyl carrier protein : glycine 
decarboxylase complex protein H 



pir:A56623



Acc# 

A56623:S36833:B56689



Description 










:I41231 :H6 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|2439i340_r3_47 


| 574 


|2494 


i p 


I 249 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


24397200_t2_32 


1 F b 


|2495 


1 l 4iU 


|1233 | |1330 | 


|1.0e-135 


Protein name 








Locus Name 


Acc# 



Description 



sp:TGT_HAEIN



P44594 



TRANSGLYCOSYLASE) (GUANINE INSERTION ENZYME)






ORF Name 


NT 

NTID AAID . — , , 
Length 


AA 

. — . , Score 
Length 


Probability 


|25400263_c3_I17 


| |576 | |2496 | |399 |


|1200 | 725 | 




Protein name 




Locus Name 


Acc# 



Description 



sp:YCAB_P5 E PR 



P72190 



HYPOTHETICAL 30.2 KD PROTEIN IN 


CAPS 3 1 REGION 




ORF Name NTID AAID 


NT AA 
. — . — . , Score 
Length Length 


Probability 


|255527<52_r2_IS | 577 |2497 


|109 |330 | |199 |


|7.2e-16 


Protein name 


Locus Name 


ACC# 


glutaredoxin 3 (grxC1) RP204


pir:F71731


F71731



Description 






ORF Name 



NT ID 



AAID 



I2562&452 c3 125 



NT AA 
Length Length 
TUG 1 \TIT 



Score 



Im- 



Probability
1.9e-08 



Protein name 



Description 



Locus Name 



sp:YCGL_ECOLI



ACC# 
P76003 



HYPOTHETICAL 12.4 KD PROTEIN IN MINC-SHEA INTERGENIC REGION




ORF Name 


NT AA 
NTID AAID . — . , _ — . , Score 
Length Length 


Probability 


282550_c3_ii0 


|579 2499 | 316 |951 |127 |


2.3e-05 



Protein name 



Locus Name 



hypothetical protein 



gp:SPR236923 



Acc# 
AJ236923 



Description 



Shewanella frigidimarina ifcA gene and ORF2 (partial) and orf1.


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

m — Score Probability 
Length - 


29314057_cl_ 


90 580 


2500 


289 


|870 | |705 | |1.7e-69 |



Protein name 



Locus Name 



probable ion transporter 



pir:E75470



Acc# 
E75470 



Description 



ORF Name 



NTID 



AAID 



NT 



AA 



Length Length 



Score 



29333458 £2 39 



Probability 
1432 | |195 | |2.7e-14 



Protein name 



Locus Name 



sp:SYL_SYNY3 



Acc# 
P73274 



Description 

LEUCYL-TRNA SYNTHETASE, (LEUCINE--TRNA LIGASE) (LEURS)



ORF Name 



NTID 



AAID 



30100432 t3 66 



NT 
n 

T5T 



AA 

— Score 
Length Length 



Probability 
|2.3e-05 



Protein name 

Description 
DNA POLYMERASE III, DELTA SUBUNIT,



Locus Name 



sp:HOLA_ECOLI



Acc# 
P28630 






ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Probability 


|3298257_11_15 


583 


2503 


179 


537 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


35637_t2_30 


ir 4 


| |2504 


1 1 1 " i 


|498 | |146 


|3.0e-10 



Protein name 



Locus Name 



unknown 



gp:AF064527



Acc# 
AF064527 



Description 

Rhodocista centenaria PPH (ppn) gene, complete cds; and unknown genes.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


390778I_c3_126 




| J2505 


1 I 170 1 






Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


3925443_c2_107 


586 


2505 


1 1 134 


1411 1 369 


6.9e-34 | 



Protein name 



Description 



Locus Name 



sp:RS6_ECOLI



Acc# 
P02358 



30S RIBOSOMAL PROTEIN S6










ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — , i Score 
Length - - 


Probability 


4003558_tl_2 


1 F' 


|2507 


i »' i 


|474 | |508 | 


|1.3e-48 | 


Protein name 








Locus Name 


ACC# 



sp:DUT_ECOLI



P06968 



Description 
(DUTPASE) (DUTP PYROPHOSPHATASE)






ORF Name 



I432844J E3 4J 



NTID 



AAID 



[2^0T 



NT 



AA 

— , Score 
Length Length 



H5B" 



Probability 
2.0e-36



Protein name 

Description 
PROTEIN-EXPORT PROTEIN SECB



Locus Name 
sp:SECB_ECOLI



Acc# 
P15040 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


4860762_t3_64 


| 589 


|2509 


| |268 


|807 | 






Protein name 








Locus Name 


Acc# 


Description 














NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


4860943_ci_yy 


| 590 


| |2510 


|185 




199 


|7.2e-l£> | 



Protein name 



Locus Name 



NADPH:quinone oxidoreductase



gp:AF145234



Acc# 
AF145234 



Description 

Arabidopsis thaliana NADPH:quinone oxidoreductase (NQR) mRNA, complete cds.



ORF Name 



14897050 t3 44 



NTID 



AAID 



AA 

— , Score 
Length Length 



][ 



KIT 



NT 

■n 



Probability 



Protein name 



| 462 | p5~[ |2.1e-32 ~ 
Locus Name Acc# 



acetylglutamate kinase



pir:D70477



D70477 



Description 



ORF Name 



NTID 



AAID 



150160 ci '/b 



NT 
2n 
[TT7 



AA 

— Score 
Length Length 



[12 7 | |384 | pu~[ [ 



Probability 
1.9e-42 



Protein name 



Locus Name 



haemoglobin-haptoglobin binding protein HhuA | gp:HIU43198



Acc# 
U43198 



Description 

Haemophilus influenzae haemoglobin-haptoglobin binding protein HhuA (hhuA) gene, complete cds.






ORF Name 


NTID 


AAID 


NT 
Length 


AA 


Probability 


6650718_r2_40 


593 


| 2513 


82 


249 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


68221b2_c3_I28 


| 594 


|2514 


1 l"» 1 


|483 | |438 


|3.4e-41 


Protein name 








Locus Name 


Acc# 










sp:RL9_ECOLI


| P02418 


Description 












50S RIBOSOMAL PROTEIN L9










ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


783426_ri_15 


595 


i p 515 


766 


|230I |2022| 


|4.8e-209 



Protein name 



Locus Name 



sp:SYL_ECOLI 



Description 

LEUCYL-TRNA SYNTHETASE, (LEUCINE--TRNA LIGASE) (LEURS)



Acc# 

P07813:P78292:P77110



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


860300_r2_29 | 


596 


|2516 




258 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


9803127_cl_88 | 


597 


|2517 


321 


| |bJ | 


J0.041 


Protein name 








Locus Name 


ACC# 


hypothetical protein (bpi 


3' region) 




[pir:C37397 


| C37397 



Description 






ORF Name 



c2 101 



Protein name 



Description 



NT ID AAID 



] i 



NT AA 

— , — Score 

Length Length 

PT7 1 11584 I [ST5TT" 



Probability 
|1.3e-88 



Locus Name 



sp:YF67_HAEIN



ACC# 

Q57408:P96 
344 



PROBABLE TONB-DEPENDENT RECEPTOR HI1557 PRECURSOR




ORF Name 


NT ID AAID 


NT AA 
— — Score 
Length Length 


Probability 


1040887_ci_72 


|599 [2519 


301 |90S | |60i | 


|1.8e-58 | 


Protein name 




Locus Name 


Acc# 



Description 



gp:AB025342 



AB025342 



Moritella marina genes, complete cds, similar to eicosapentaenoic acid synthesis gene cluster.



ORF Name 
[10646402 J:3_46 



NT ID 



AAID 



[J57TT 



NT AA 
Length Length 
755 — 



— Score Probability 



Protein name 



[1188 | [953 | |5.0e-96 ~ 
Locus Name Acc# 



sp:AROF_ECOLI



P00888 



Description 

SYNTHETASE) (3-DEOXY-D-ARABINO-HEPTULOSONATE 7-PHOSPHATE SYNTHASE)



ORF Name 


NT ID 


AAID 


NT 
Length 


. — . , Score 
Length 


Probability 


10723543_fc3_48 


| 501 


2521 


73 


|222 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


I09690B7J:3_60 


502 


2522 


1242 


|3729 | |2415 | 


p.le-285 



Protein name 



Locus Name 



DNA polymerase III 



gp:AF062919 



Acc# 
AF062919 



Description 

Pseudomonas fluorescens DNA polymerase III (dnaE) gene, complete cds.






ORF Name 



119552252 c3 122 



Protein name 



Description 



NTID 



AAID 



NT AA 
Length Length 

TIT — 



Score 



Locus Name 



gp:D90863



Probability 
5.3e-20 



Acc# 

D90863:AB001340



E.coli genomic DNA, Kohara clone #407 (52.4-52.8 min.).




ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


20180387_c2_104 


|604 |2524 


| 208 


1 P 7 1 I 117 1 


|1.0e-05 |


Protein name 






Locus Name 


Acc# 



sp:Y366_HAEIN



P43988 



Description 
HYPOTHETICAL PROTEIN HI0366 PRECURSOR 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — Score 
Length 


Probability 


|20355003_t2_25 


605 


2525 


173 | 


522 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|20485501_c3_115 




|2526 


| 154 


|465 |239 | 


4.1e-20


Protein name 








Locus Name 


Acc# 


hypothetical protein PH0336






pir :E71140 


E71140 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


2120263_c3_114 


1 I 507 


|2527 


200 


|603 | |217 | 


B.9e-i8 | 



Protein name 

Description 
I (P286) 



Locus Name 



sp:YGGB_ECOLI



Acc# 
P11666 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

„ — , Score 
Length 


Probability 


22131925_c3_124 | 


608 


2528 




1164 |I176 | 


|2.Xe 


-119 


Protein name 










Locus Name 




Acc# 


AarC 




gp:PSU67933 


U67933 


Description 
















Providencia stuartii AarC (aarC) gene, complete cds.






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|24219200_r3_59 | 


609 


2529 


413 | 


|1242 | |872 | 


3.5e-87


Protein name 










Locus Name 




Acc# 



sp:YCFD_HAEIN 



P44683 



Description 
HYPOTHETICAL PROTEIN HI0396 



ORF Name 



NTID AAID 



124412678 t2 27 



2530 




AA 

T — ^ Score 
Length 



ITT 



Protein name 

Description 
RIBONUCLEASE HII, (RNASE HII) (FRAGMENT)



Locus Name 



sp:RNH2_VIBCH 



Probability 
1.3e-25 

Acc# 
P52021 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — , , Score 
Length - - - 


Probability 


24423250_t3_53 


511 


2531 


" i 


P 04 | 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


25573802_ti_4 


612 


2532 


1 Ub 1 


|1341 | 456 | 


4.2e-43 | 



Protein name 



Locus Name 



lipid-A-disaccharide synthase,



pir :E64180 



ACC# 
E64180 



Description 






ORF Name 



NTID AAID 



AA 

— , Score 
Length Length 



129407800 cl 80 



NT 
n 



Probability 



[1212 | 



TT25" 



5.4e-114 



Protein name 



Locus Name 



sp:YFGB_PSEAE



Description 

HYPOTHETICAL 41.7 KD PROTEIN IN PILF-NDK INTERGENIC REGION (ORF1)



Acc# 

Q51385:Q51 
525 



ORF Name 



NTID 



AAID 



NT AA 
— , — Score 
Length Length 



Probability 



3145438 ci 69 



Protein name 



| 514 | [2534 | |485 | [1458 | p^"| |i.3e-66 

Locus Name Acc# 



unknown 



gp:AF003741



] 



AF003741 



Description 



Escherichia coli CFT073 pathogenicity island gene, complete cds.


ORF Name 


NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


33882815_cl_79 


515 2535 


273 | 


|822 


| 243 


|1.2e-41 |



Protein name 



Description 



Locus Name 



sp:YFCB_ECOLI 



Acc# 

P39199:P78 
252 :P76939 



(EC 2.1.1.72) 



ORF Name 


NT 

NTID AAID _ — ^ 
Length 


AA 

T — , i Score 
Length 


Probability 


3906293_ri_3 


616 | |2536 | no 


|333 | 146 | 


|3.0e-10 


Protein name 




Locus Name 


Acc# 






sp:YDAL_ECOLI


| P76053 


Description 








HYPOTHETICAL 21.5 KD PROTEIN IN OGT-DBPA INTERGENIC REGION




ORF Name 


NT 

NTID AAID L , 

Length 


AA 

— , Score 
Length 


Probability 


3907568_c2_105 


| |SI7 | 253V | 395 


|1188 |385 | 


|1.1e-35



Protein name 



Locus Name 



sp:YFGL_ECOLI



Acc# 
P77774 



Description 

HYPOTHETICAL 41.9 KD PROTEIN IN XSEA-HISS INTERGENIC REGION






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 




1 1618 
1 I 618 


|2 53 8 


425 " 


1278 


|1077 | 


6.6e-109


Protein name 










Locus 


Name 




Acc# 












sp:SYH_ECOLI


P04804 


Description 






















ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|3945293_c3_113 




|2539 


|328 


|987 | 


696 


1.5e-68



Protein name 



Description 



Locus Name 



sp:SOHB_HAEIN



Acc# 
P45315 



POSSIBLE PROTEASE SOHB,












ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|3S46852_c3_i26 


| po 


| |2540 


280 | 


|843 | 


164 


6.3e-12 | 



Protein name 



Description 



Locus Name 



sp:Y370_HAEIN 



ACC# 
P43989 



HYPOTHETICAL 


PROTEIN HI0370 










ORF Name 


NT ID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|39469i7_t3_64 


| |52i |2541 


205 | 


I s18 1 


i" 8 1 


|1.3e-48 |



Protein name 



Description 



Locus Name 



sp:3MGA_HAEIN 



Acc# 
P44321 



GLYCOSIDASE) (TAG) 


ORF Name NT ID AAID 


NT 
Length 


AA 

r — , i Score 
Length 


Probability 


|4148383_ci_83 | |522 |2542 


321 


\966 | 557 


|8.3e-54 | 


Protein name 






Locus Name 


Acc# 


hypothetical protein HP0852 


pir :D64626 


D64626 



Description 






ORF Name 



NTID AAID 



14181537 ci 88 



Z7T 



NT AA 

— — Score 

Length Length 

573 1 11422 



Probability 
2.2e-144 



Protein name 



Locus Name 



sp:YFGK_ECOLI 



Acc# 
P77254 



Description 

HYPOTHETICAL GTP-BINDING PROTEIN IN XSEA-HISS INTERGENIC REGION



ORF Name 
|4460938_t23TT 



NTID 



AAID 



NT 
n 
E7T 



AA 

— Score 
Length Length 



] s 



Protein name 



Locus Name 



O-acetylserine synthase



gp:AF010139 



Probability 
| 3.8e-55 

ACC# 
AF010139



Description 



Azotobacter vinelandii iron-sulfur cluster assembly gene cluster, sunB,
cysE2, iscS, iscU, iscA, hscB, hscA and fdx genes complete cds; ndk gene,
partial cds.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


5114700_ci_86 


| 625 


i p 545 


| 77 


P 34 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — . . Score 
Length 


Probability 


|52138_cl_9I 


| 626 


|2545 


109 | 


330 199 | 


|1.2e-15



Protein name 



Locus Name 



solanesyl diphosphate synthase



gp:AB001997 



Acc# 
AB001997 



Description 

Rhodobacter capsulatus DNA for solanesyl diphosphate synthase, complete cds.



ORF Name 



16140580 Tl 36 



Protein name 



NTID AAID 



SIT 



]E 



2547 



NT AA 
Length Length 

] zzn 



7UU 



Score Probability 
11 87 | | i.9e-29 



Locus Name 



hypothetical protein b2532



pir:C65030



Acc# 
C65030 



Description 






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|S48425_ci_90 | 


628 


2548 


r i 


1254 1 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|10378_c2_i84 | 


|629 


| |2549 


i r i 


|258 | |251 | 


2.2e-21


Protein name 










Locus Name 




Acc# 


cold shock protein, CSPA








gp:VCCSPA


Y11908 


Description 














V.cholerae cspA gene.


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


1063510_C2_198 | 




| |2550 


| 175 


|528 |208 | 


1.5e-15


Protein name 










Locus Name 




Acc# 


uridylyl transferase 




gp:AB024601 




AB024601 



Description 



Pseudomonas aeruginosa dapD gene for tetrahydrodipicolinate N-succinyletransferase, complete cds, strain PAO1.



ORF Name 
|il75012_cl_174 
Protein name 



NTID AAID 



NT 
Length 

pm 



AA 

— , Score Probability 
Length 



T7JE- 



3.2e-93



Locus Name 



acetate kinase 



|pir:B75254 



Acc# 
B75254 



Description 



ORF Name 



111988812 ci 171 



Protein name 



Description 



NO-HIT



NTID AAID 



NT 
Length 

prcr — 



— , Score Probability 



AA 
Length 



Locus Name 



ACC# 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Probability 


12532562_c3__232 


|633 


2553 


299 


900 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|1254200b_r3_130 


| 634 


| p.M 


P 1 


|279 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|13001052_ri_50 


| 635 


[2555 


|62 


|189 | 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


13852337_c2_202 


| 636 


| |2556 


351 


1056 515 | 


2.3e-49 


Protein name 








Locus Name 


Acc# 



Description 



sp:APBE_HAEIN



P44550 



THIAMINE BIOSYNTHESIS LIPOPROTEIN APBE PRECURSOR




ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|14225300_c3_224 


| 637 |2557 


| 123 


|372 | |440 | 


|2.1e-41


Protein name 






Locus Name 


ACC# 


PII-protein


|gp:AVU91902 


| U91902 



Description 



Azotobacter vinelandii PII-protein (glnB) and methylammonium transport
protein (amtB) genes, complete cds.






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ - — . , Score 
Length 


Probability 


|15&28127_t3_9B 


638 


2558 


72 


219 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|15630192_c3_217 


1 P 9 


|2559 


162 | 


[489 | [270 | 


|2.0e-25 


Protein name 








Locus Name 


Acc# 



Description 



sp:UP04_ECOLI



P39169:P76624:P77022:P77023



UNKNOWN PROTEIN FROM 2D-PAGE (SPOT LM6)






ORF Name 


NTID AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


16597790_t3_131 


|640 |25<S0 


pu | 


1062 232 | 


2.6e-19


Protein name 






Locus Name 


ACC# 



Description 



sp:NUC1_CUNEE



P81203 



NUCLEASE C1,


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


166034iiJ:2_8!> 


|641 


| [2561 


319 | 


I 960 1 


597 | 


|4.8e-58 | 



Protein name 
Description 

PUTATIVE 2-HYDROXYACID DEHYDROGENASE HI1556



Locus Name 



sp:YP36_HAEIN



Acc# 
P45250 



ORF Name 
17036428 E3 138 



NTID 



AAID 



— , . — , Score 



2562 



NT 
Length 



AA 
Length 



Protein name 
Description 

NO-HIT



Locus Name 



Probability 



Acc# 






ORF Name 


NTID 


AAID 


NT 
Length 


- — . , Score 
Length 


Probability 


19564458_t2_55 


543 


2563 


212 


539 |192 | 


4.0e-15


Protein name 








Locus Name 




Acc# 


probable glpG protein 






pir :D71258 




D71258 


Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|21766930_t3_100 | 


544 


| |2564 


1 1510 1 

1 l b±u \ 


11833 1 1554 1 
1 | | 


|1.7e- 


1 


Protein name 








Locus Name 




Acc# 


hypothetical protein 






[pir:S75944 




S75944 


Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


22035932_c2_209 


645 


1 12555 
1 1 


1415 1 
1 1 


1251 | |515 | 


2.3e-49


Protein name 








Locus Name 




ACC# 


B1306.06C protein 








gp:MLB1306




Y13803 


Description 














Mycobacterium leprae cosmid B1306 DNA.








ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


22775251_cl_144 


646 


2566 


543 


1632 | (1799 | 


2.0e-185



Protein name 



Locus Name 



acetolactate synthase, III large chain: acetohydroxy-acid synthase III large chain



pir:YCEC3I



Description 



ORF Name 



Acc# 

E64729:S14 
385 :S40590 
:A01113:I4 



NTID 



AAID 



22890836 c2 214 



[2"5F7~ 



NT 
Length 
|179 



AA 

t — ^ Score 
Length 



Probability 



Protein name 



Description 



|540 | [415 [ [l.fle-41 

Locus Name Acc# 



gp:AHU56832



U56832 



Aeromonas hydrophila FK506 binding protein (fkpA) gene, complete cds in 3.9 kb fragment.






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|23595787_c3_220 




2568 


150 


453 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT


ORF Name


NTID


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


123718750 t2 93 
1 - - 


J L.- _ 


| |2569 




609 432 


1.5e-40


Protein name 










Locus Name 




Acc# 












sp:RUVC_HAEIN 


P44633 


Description 
















JUNCTION NUCLEASE RUVC) (HOLLIDAY JUNCTION RESOLVASE RUVC)






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length - - 


Probability 


23727181_c2_215 


II 5 " 


2570 


J iJ0 J 


|993 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|23939008_ci_161 




| 2571 


1 12J 1 


1372 1 141 1 


1.0e-09


Protein name 










Locus Name 




Acc# 


1 hypothetical protein 






pir:T10511 




T10511 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|23942567_c2_212 


|| S52 


| |2572 


1 1 66 1 


poi 1 






Protein name 










Locus Name 




Acc# 



Description 
NO-HIT






ORF Name 


NTID 


AAID


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


23963325_r3_123 


653 


2573 


403 1 
1 


1212 |879 | 


|6.3e-88 


Protein name 










Locus Name 


Acc# 












sp:FADH_ECOLI 


P42593 


Description 














A REDUCTASE) 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


|24353427_c2_200 


|| 65 4 


2574 


545 


|1641 |2074 | 


|1.5e-214 | 


Protein name 










Locus Name 


Acc# 



Description 



sp:CH60_YEREN 



P48219 



60) (CROSS-REACTING PROTEIN ANTIGEN)








ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|24612761_c3_246 


655 2575 




|780 | 467 


2.9e-44 


Protein name 






Locus Name 


ACC# 



Description 



sp:YMFC_HAEIN 



P44827 



HYPOTHETICAL PROTEIN HI0694 1 


ORF Name NTID AAID 


NT 
Length 


AA 
Length 


Probability 


|24648387_ci_i70 J |656 2576 


i p m i 


p 4s i p i 


6.6e 


1 


Protein name 




Locus Name 




Acc# 


thymidylate synthase


|gp:L78665 


L78665 



Description 



Methylobacillus flagellatum aspartate aminotransferase (aat), membrane
protein (orf-1), homoserine dehydrogenase (hom), and threonine synthase (thrC)
thymidylate synthase (thyA) genes, complete cds.



NTID 



[51T7~ 



AAID 



NT 

in 



ORF Name 
|24736685_c2_i97 
Protein name 



Description 

METHIONINE AMINOPEPTIDASE, (MAP) (PEPTIDASE M)



AA 

T — *-u Score 
Length Length 



Probability 
37H~] |i.2e-75 ~ 



Locus Name 



sp:AMPM_ECOLI 



Acc# 
P07906 






ORF Name 



125390711 cl 153 



Protein name 



Description 



NTID 



AAID 



NT AA 
Length Length 
1 11245 



Score 



Probability 
2.8e-60 



Locus Name 



sp:MUTY_ECOLI 



Acc# 
P17802 



A/G-SPECIFIC ADENINE GLYCOSYLASE,


ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|25412907_11_37 


| |659 | |2579 


1 P 1 


|288 | |74 | 


|0.044 


Protein name 






Locus Name 


Acc# 


hypothetical protein 




|gp:AP000363 


AP000363 


Description 


Bacteriophage VT2-Sa, complete genome sequence.




ORF Name 


NTID AAID 


NT 
Length 


AA 

„ — L , Score 
Length 


Probability 


2556S577_i:i_23 


|660 2580 


** 1 


|252 




Protein name 






Locus Name 


Acc# 


Description 










NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


25585263_r2_64 


|661 | 2581 


1 an 1 


|93S | |790 | 


|1.7e-78 | 


Protein name 






Locus Name 


Acc# 


diaminopimelate epimerase,




1 pir:S01913 




Description 








1 B65185:S30 

699:S01913 
:A37841:S2 


ORF Name 


NTID AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|26213386_c3_242 


| |662 | |2582 


1 404 1 


|1215 | |393 | 


|2.0e-36 



Protein name 

Description 
UBIH PROTEIN, 



Locus Name 



sp:UBIH_ECOLI 



Acc# 
P25534 






ORF Name 



NTID 



AAID 



126214429 c2 201 



7FF3" 



NT 
Length 




AA 

_ — Score 
Length 

[7TT~ 



Protein name 



Description 



Locus Name 



Probability 
|S.7e-71 

Acc# 



sp:LGT_SALTY 



Q07293 



PROLIPOPROTEIN DIACYLGLYCERYL TRANSFERASE,



ORF Name 



29303b7B t2 74 



Protein name 



Description 



NTID 



AAID 



— — Score Probability 



NT 
Length 

] I!! " 



AA 
Length 
[2TTT 



Locus Name 



Acc# 



NO-HIT 



ORF Name 



29380307 c2 199 



Protein name 

Description 

NO-HIT 



NTID 



AAID 



665 



NT 
Length 



AA 

t — _v. Score 
Length 



■] |258b | |166 | |501 | 



Locus Name 



Probability 



Acc# 



ORF Name 



30258885 c3 234 



Protein name 
Description 



NTID 



AAID 



NT 
Length 
53 



AA 

t — *_v. Score 
Length 



TS2~ 



Locus Name 



Probability 



Acc# 



NO-HIT



ORF Name 



33526535 ti bl 



Protein name 
Description 



NTID 
] E 7 = 



AAID 



E 



NT 
Length 

] EE= 



AA 

T — _^ Score 
Length 



1 D 



Locus Name 



Probability 



Acc# 



NO-HIT






ORF Name 



NT ID AAID 



NT AA 

— Score 

Length Length 



Probability 



|344I607S_c3_23I 


668 


2588 


803 


2412 |1247 | 


6.3e-127


Protein name 










Locus Name 




Acc# 












sp:NFRX_AZOVI




P36223 


Description 
















TRANSFERASE ) (URIDYLYL REMOVING 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


35425918_c3_241 




| |2589 


1 I 176 1 


pi | jjoa | 


8.7e-27


Protein name 










Locus Name 




Acc# 


dihydrofolate reductase,




pir:S52336 




| S52336 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

Length Score


Probability 


3605i06iJ:2J78 




j |2590 


327 | 


|984 | |930 | 


2.5e-93


Protein name 










Locus Name 




Acc# 


probable 2 




pir :G70875 


G70875 



Description 



ORF Name 



NTID 



AAID 



13910943 ci 146 



6TT 



2591 



NT 

n 



AA 

— Score 
Length Length 



TUJT 



Probability 
[1278 | |3.3e-130 ~ 



Protein name 



Locus Name 



ketol-acid reductoisomerase 



gp:AF125563



Acc# 
AF125563 



Description 



Neisseria meningitidis NMB putative aconitate hydratase (acn), ornithine
carbamoyltransferase (argF), and ketol-acid reductoisomerase (ilvC) genes,
complete cds.



ORF Name 
|3914002_i33TIT" 

Protein name 
Description 
NO-HIT



NTID 



AAID 



NT AA 
— — Score 
Length Length 



672 



] [2592 | |253 | |762 | 



Locus Name 



Probability 



Acc# 






ORF Name 



NT ID AAID 



AA 

— Score 
Length Length 



3939063 r2 89 



Z7T 



2"5"9T 



NT 
n 



Z7T 



Probability 
3.2e-61



Protein name 

Description 
ADDING ENZYME) 



Locus Name 



sp:MURD_ECOLI 



Acc# 
P14900 



ORF Name 


NT ID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


3947713_ri_5 


574 2594 


260 | 


p" i 




|1.0e-57 |


Protein name 






Locus 


Name 


Acc# 








sp:YAAA_HAEIN


P43908 


Description 












HYPOTHETICAL 


PROTEIN HI0984 










ORF Name 


NT ID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


3954638J:2_63 


|675 | |2595 


| 438 


|13i7 | 


|i 0S 7 | 


|7.5e-108 | 



Protein name 



Locus Name 



sp:DCDA_PSEAE 



ACC# 
P19572 



Description 

DIAMINOPIMELATE DECARBOXYLASE, (DAP DECARBOXYLASE)



ORF Name 



NTID AAID 



3962885 c3 219 



NT 
n 
TFT 



AA 

— Score 
Length Length 



[555" 



] [ 



Probability 
|2.8e-46 



Protein name 



Locus Name 



acetolactate synthase, III small chain: acetohydroxy-acid synthase III small chain



|pir:YCEC3H 



Description 



ORF Name 



3992035 ci 175 



NTID AAID 

]EHZ 



AA 

— Score 
Length Length 



NT 
an 



] [ 



Protein name 



Locus Name 



sp:PTA_HAEIN 



Acc# 

F64729:S14 
386:S40591 
:A01114:PS 



Probability 
|4.6e-127 

ACC# 
P45107 



Description 

PHOSPHATE ACETYLTRANSFERASE, (PHOSPHOTRANSACETYLASE)






ORF Name 



NT ID 



AAID 



4140881 ri 40 



NT 
n 

TS7 



AA 

— Score 
Length Length 

7T7T* 



T3W 



Probability 
|5.1e-70



Protein name 



Locus Name 



NorM 



gp:AB010463



Acc# 
AB010463 



Description 

Vibrio parahaemolyticus gene for NorM, complete cds.



ORF Name 


NT ID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|4147562_ci_lb9 


| 579 | 2599 | 


523 | 


|1572 | |1209| 


|6.7e-123 


Protein name 






Locus Name 


ACC# 








sp:YIFB_HAEIN 


P45049 


Description 










HYPOTHETICAL 


PROTEIN HI1117 






i 


ORF Name 


NT ID AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


|447i887_t3_i35 


| 680 | |2SuT5 | 


415 | 


1248 |537 | 


|2.8e-52 



Protein name 



Locus Name 



FtsW 



gp:AF123260



Acc# 
AF123260 



Description 

Coxiella burnetii FtsW (ftsW) gene, complete cds.



ORF Name NTID AAID 


NT 
Length 


AA 

— , Score Probability 
Length 


4720313_ci_180 | 581 | |250i 


1 ^ 1 


|450 | 151 |1.2e-10 


Protein name 




Locus Name Acc# 






sp:METE_ECOLI | P25665


Description 






(COBALAMIN- INDEPENDENT METHIONINE 


SYNTHASE) 




ORF Name NTID AAID 


NT 
Length 


AA 

L — th Score Probability 


4770887_t2__72 | 582 2502 


175 


|531 | 130 | 2.7e-14 


Protein name 




Locus Name Acc# 


hypothetical protein 




, gp:SSU18930 Y18930 


Description 


Sulfolobus solfataricus 281 kb genomic DNA fragment, strain P2.






ORF Name NTID AAID 


NT 
Length 


, — . , Score 
Length 


Probability 


5100010 tl 7 | 583 2503 




1002 581 


2 .4e 




Protein name 




Locus Name 




Acc# 






sp:XERC_HAEIN




P44818 


Description 










INTEGRASE/RECOMBINASE XERC 


ORF Name NTID AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


5275250_c3_235 |584 | J2504 


| |202 


|609 | |328 | 


1.5e-29


Protein name 




Locus Name 




Acc# 


Trp repressor binding protein 




gp:AF067083 




AF067083 



Description 



Vitreoscilla sp. outer membrane protein homolog gene, complete cds; Trp
repressor binding protein gene, partial cds; and unknown genes.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 




593802_£2_71 


i r i 


|2505 


i r i 


|1521 | |433 | 


1.1e-40


Protein name 










Locus Name 




Acc# 












sp:RBN_HAEIN


P44608 


Description 
















RIBONUCLEASE BN, (RNASE BN)
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


5970933_tl_15 


| |686 


2505 


105 | 


|318 | 249 | 


3.6e-21


Protein name 










Locus Name 




ACC# 


unknown protein




gp:MSGTCWPA 


M15467 



Description 



M.tuberculosis 65 kDa antigen (cell wall protein a) gene.




NT AA 

ORF Name NTID AAID . , — . , „ — . , 

Length Length 


Score 


Probability 


|5988327_c3_227 587 | 2607 | 927 | 2784 


J1588 | 


|4.Se-163 



Protein name 



Locus Name 



sp:HEPA_ECOLI 



Description 

RNA POLYMERASE ASSOCIATED PROTEIN (ATP-DEPENDENT HELICASE HEPA)



Acc# 

P23852 :P75 
633 






ORF Name 


NT ID AAID 


NT 
Length 


_ — . , Score 
Length 


Probability 


16488910 £2 94 


588 | 2608 


56 


1201 1 185 1 
1 III 


|0.0050 


Protein name 






Locus Name 




Acc# 








sp:YIHR_ECOLI




P32139 


Description 












HYPOTHETICAL 34.0 KD PROTEIN IN GLNA-RBN INTERGENIC REGION






ORF Name 


NTID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|5516885_ci>_i8b 


| |689 | |2609 


1 1 1 


1642 1 1140 1 
1 1 1 1 


|3 .5e 




Protein name 






Locus Name 




ACC# 


putative membrane protein. 




| gp:SC6D7 




AL133213 


Description 












Streptomyces coelicolor cosmid 6D7.








ORF Name 


NTID AAID 


NT 
Length 


AA 

Score
Length 


Probability 


665876_c:i_245 


|590 | 2610 


1 552 
1 


1659 215 1 
1 


rrie- 


-16 


Protein name 






Locus Name 




Acc# 








sp:OMPA_BORAV




Q05146 


Description 












OUTER MEMBRANE 


PROTEIN A PRECURSOR 










ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


8575_c3_236 


| 691 |2611 


| 140 


|423 | |343 


4.0e-31


Protein name 






Locus Name 




Acc# 








sp:CH10_PSEST 




O33499


Description 












10] 


ORF Name 


NTID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|859388_12_86 


692 | |2612 


ip 42 i 


1029 537 


1.1e-51



Protein name Locus Name Acc# 

sp:ISPA_HAEIN | P45204



Description 

(FPP SYNTHASE)






ORF Name NTID AAID 


NT 
Length 


„ — . , Score 
Length 


Probability 


970625_cl_145 593 2613 


58 


|207 | |197 | 


1.1e-14


Protein name 




Locus Name 




Acc# 






sp:ILVI_ECOLI




P00893 :P78 
045 


Description 








III) (ACETOHYDROXY-ACID SYNTHASE III LARGE SUBUNIT) (ALS-III)






ORF Name NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


9957828_t2_88 | 694 | |2614 


1 I 149 1 


|450 |85 | 


0.00085


Protein name 




Locus Name 




Acc# 


transposase 


gp:CETC2


X59156:S88451


Description 








Caenorhabditis elegans transposon Tc2.


ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


21736625_c2_ll 695 | 2615 


290 | 


|870 j |508 j 


1.3e-48


Protein name 




Locus Name 




ACC# 






sp:APAH_HAEIN 




P44751 


Description 










(DIADENOSINE TETRAPHOSPHATASE)


ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|26053825_il_4 696 | |2616 


1 P a 1 


|1587 | |1002 | 


5.8e-101


Protein name 




Locus Name 




Acc# 






sp:DNAB_ECOLI




P03005 


Description 










REPLICATIVE DNA HELICASE, 


ORF Name NTID AAID 


NT 
Length 


AA 

_ — . _ Score 
Length 


Probability 


33984677_i3_6 | 697 | |2617 


i r i 


(1146 | 555 | 


1.4e-53


Protein name 




Locus Name 




Acc# 


biosynthetic alanine racemase 




gp:AF165882




AF165882 



Description 



Pseudomonas aeruginosa biosynthetic alanine racemase (alr) gene, complete
cds.






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability


[5181430 c3 14 


598 


2618 


327 


|984 | 


I 617 1 


|3.6e-60 


Protein name 








Locus 


Name 


Acc# 










sp:PDXA_ECOLI


P19624


Description 














PYRIDOXAL PHOSPHATE BIOSYNTHETIC PROTEIN PDXA




1 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|7580 c3 15 | 


699 


| J2619 


i ^ i 


888 J 


r i 


1.1e-58


Protein name 








Locus 


Name 


Acc# 










sp:KSGA_ECOLI


P06992 


Description 














DIMETHYLTRANSFERASE


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|10547156_Cl_79 


700 


|2620 


— | 1096 


3291 


|4176 | 


|0.0 | 



Protein name 



Locus Name 



carbamoylphosphate synthetase large subunit


gp:PAU81259



Description 



Acc# 

U81259:L27528



Pseudomonas aeruginosa dihydrodipicolinate reductase (dapB) gene, partial
cds, carbamoylphosphate synthetase small subunit (carA) and carbamoylphosphate
synthetase large subunit (carB) genes, complete cds, and FtsJ homolog (ftsJ)
gene, partial cds.



ORF Name 



NTID 



AAID 



1191b9b^ cl 76 



7UT" 



NT 
n 
TIE 



AA 

— Score 
Length Length 



Probability 
7 .6e-28 



Protein name 



Locus Name 



probable oxidoreductase 



pir:T35853



Acc# 
T35853 



Description 

ORF Name 
|13947152J:2jr; 

Protein name 

Description 

NO-HIT 



NTID 



AAID 



AA 

— Score 
Length Length 



7uT" 



][ 



NT 
an 

] EE! 



] CO 



Locus Name 



Probability 



Acc# 






ORF Name 



14094052 tl 20 



Protein name 



Description 



NTID AAID 



NT AA 
— — Score 
Length Length 



Probability 



TOT" 



TTT 



|335 | |150 | |9.7e-12 — 
Locus Name Acc# 



sp:YCCK_ECOLI 



P45572:P75878



HYPOTHETICAL 12.4 KD PROTEIN IN HELD-SERT INTERGENIC REGION


1 


ORF Name 


NT AA 

NTID AAID . _ „ _ 

Length Length 


Score 


Probability 


1660i077_t2_30 


704 2524 93 | 282 | 


I 1 " 1 


1.1e-05



Protein name 



Locus Name 



hypothetical protein



bp:SSU18930 



Acc# 
Y18930 



Description 

Sulfolobus solfataricus 281 kb genomic DNA fragment, strain P2.



ORF Name 



19562686 t3 66 



NTID 



AAID 



][ 



AA 

— Score 
Length Length 

STU- 



NT 
n 

1779" 



] i 



Probability 
|S.0e-5i 



Protein name 

Description 
TRANSCRIPTIONAL REGULATOR CBL 



Locus Name 



sp:CBL_ECOLI 



Acc# 

Q47083:P76353



ORF Name 



NTID 



AAID 



NT AA 
— — Score 
Length Length 



19571925 c3 117 



TOT" 



TTT 



] ED 



Protein name 
Description 
NO-HIT



Locus Name 



Probability 



Acc# 



ORF Name 



NTID 



AAID 



203i332fi_t2_40 ] |707 | [ 



[T5TT" 



NT AA 
Length Length 
^5 



— , — , Score Probability 



] EZO 



Protein name 

Description 

NO-HIT



Locus Name 



Acc# 






ORF Name 



NT ID 



AAID 



20488827 tl 10 



7W 



2628 



NT AA 
Length Length 

I 11677 



Score Probability 
11 538 | |2.3e-158 



Protein name 



Locus Name 



sulfite reductase



gp:AF026066



Acc# 
AF026066 



Description 

Pseudomonas aeruginosa sulfite reductase (cysI) gene, complete cds.



ORF Name 
|20953402_c2_103 



NT ID 
"J |709 



AAID 



][ 



NT AA 
Length Length 
TZ% — 



Score 



] [ 



[2TT 



Probability 
|3.2e-15 



Protein name 



Locus Name 



sp:LPPB_HAEIN



Acc# 
P44833 



Description 

OUTER MEMBRANE ANTIGENIC LIPOPROTEIN B PRECURSOR



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


21907078_t3_55 


| 710 


| |2<S30 








Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


22766067_tl_2i 


1 1 711 


| |263i 


1 l 4VU 1 


|14iV | |96 | 


|0.012 | 


Protein name 








Locus Name 


Acc# 



sp:THDF_MYCGE



Description 

POSSIBLE THIOPHENE AND FURAN OXIDATION PROTEIN THDF 



P47254:Q49330



ORF Name 



24251510 c2 88 



NT ID 
|712 



AAID 



] |2£32 | 



NT AA 
Length Length 



Score 



Protein name 



Locus Name 



hypothetical protein Rv3629c 



pir:F70561



Probability 
|7.8e-74 

Acc# 
F70561



Description 






ORF Name 



NTID 



AAID 



242S9bbS t2 39 



[7TT 



NT 
n 



AA 

— , Score 
Length Length 



Probability 
I.6e-34 



Protein name 



Locus Name 



similar to glutathione-S-transferase



gp:AF036940



Description 



Acc# 

AF036940:AF081362



Pseudomonas sp. U2 plasmid pWWU2, ferredoxin reductase
(nagAa), salicylate-5-hydroxylase large oxygenase component
(nagG), salicylate-5-hydroxylase small oxygenase component (nagH), ferredoxin
(nagAb), naphthalene dioxygenase large oxygenase component (nagAc),
naphthalene dioxygenase small oxygenase component (nagAd), cis-naphthalene



NT 

. ORF Name NTID AAID — L1 
Length 


AA 

— , Score 
Length 


Probability 


24801562_c2_82 | |714 | |2G34 | 554 


|1665 | |50£ 


8.0e-73 ] 


Protein name 


Locus Name 


ACC# 


permease for AmpC beta-lactamase expression


gp:AF082985 


AF082985 


Description 


Pseudomonas aeruginosa permease for AmpC beta-lactamase expression AmpG
(ampG) gene, complete cds; and unknown gene.


NT 

ORF Name NTID AAID , — , , 
Length 


AA 

— , Score 
Length 


Probability 


254484i2_c2_95 | 715 | |2S35 | |98 


|297 | |185 


1.7e-14


Protein name 


Locus Name 


ACC# 


unknown 


gp:AF033858 


AF033858 


Description 


Pediococcus pentosaceus strain ATCC43200 plasmid pMD136, complete plasmid
sequence.


ORF Name NTID AAID Jj^ 


AA 

. — . , Score 
Length 


Probability 


29429590_t2_44 | |7I6 | |2S36 | |i62 


|489 | 241 | 


2.5e-20 


Protein name 
Description 


Locus Name 
sp:VDLD_HELPY


Acc# 
O05729



PROTEIN VDLD 






ORF Name 



|301«lb87 tl 9 



NTID 



AAID 



NT 
Length 
75 



AA 

T — fc u Score 
Length 



Probability 
0.037 



Protein name 



Locus Name 



bone morphogenetic protein 2 



pir:A61387 



Acc# 
A61387 



Description 



ORF Name 



3129637 c2 94 



NTID 



AAID 



NT AA 
Length Length 



Score Probability 



72T" 



Protein name 



[972 | |577 | |6.3e-56 — 
Locus Name Acc# 



mrr restriction system protein 



pir:F75508



F75508 



Description 



ORF Name 



NTID 



AAID 



31375212 t3 S4 



NT 
Length 

EHZZI 



AA 

T — Score 
Length 



1848 



Probability 
[1278 | |3.3e-130 ~ 



Protein name 



Description 



Locus Name 



sp:YFBQ_ECOLI



Acc# 
P77727 



PROBABLE AMINOTRANSFERASE YFBQ, 




ORF Name NTID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


32878_12_25 720 2540 


349 


1050 176 | 


1.2e 


I 


Protein name 




Locus Name 




Acc# 






gp:ECO110K






Description 








D10483:J01597:J01683:J01706:K0


E.coli K12 genome, 0-2.4min. region. 




ORF Name NTID AAID 


NT 
Length 


AA 

- — . , Score 
Length 


Probability 


|33788387_c3_108 | 721 |264i | 




|201 | |87 | 


0.00053


Protein name 




Locus Name 




ACC# 


hypothetical protein spcp^ibig . 02


|pir:T41692 




T41692 



Description 






ORF Name 



NT ID 



AAID 



NT AA 
— — Score 
Length Length 



13915588 ci 78 



|722 




2542 




200 




503 


482 





Probability 
7.4e-46 



Protein name 

Description 
HYPOTHETICAL PROTEIN HI0318



Locus Name 



sp:Y318_HAEIN



Acc# 
P43984 



ORF Name 
|40875_11__T5" 



NT ID AAID 



NT AA 
— — Score 
Length Length 



Protein name 



Description 



| 723 | [2543 | |125 | [381 | |189 | 

Locus Name 



sp:YHEN_ECOLI



Probability 
8.2e-15

Acc# 
P45532 



HYPOTHETICAL 13.5 KD PROTEIN IN RPSL-FKPA INTERGENIC REGION




ORF Name 


NT AA 
NTID AAID _ — . , _ — . , Score 
Length Length


Probability 


|4490902_c2_101 


| |724 (2544 |155 501 P^l 


4.3e-50 | 


Protein name 


Locus Name 


Acc# 



Description 



sp:GREA_ECOLI



P21346:P78111



GREAT 



ORF Name 



NTID AAID 



NT AA 
— — Score 
Length Length 



14535658 c3 107 



725 


2545 940 




2823 1554 



Probability 
1.9e-159



Protein name 

Description 
SYNTHETASE ADENYLYLTRANSFERASE) (ATASE)



Locus Name 



sp:GLNE_ECOLI 



Acc# 

P30870:P78107






ORF Name 



NT ID 



AAID 



4572125 cl 72 



NT 
n 
TIT 



AA 

— Score 
Length Length 



Probability 
l.le-Bl 



Protein name 



Locus Name 



unknown 



PAU63816



ACC# 
U63816 



Description 



Pseudomonas aeruginosa glnE gene, partial cds; xlvE, ADP-heptose:LPS
heptosyltransferase homolog (waaF), lipopolysaccharide heptosyltransferase I
homolog (waaC), glucosyltransferase I homolog (waaG), RfaP protein (waaP),
and unknown protein (waaX) genes, complete cds; and inaA gene, partial cds.



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|50B4827_i3_48 


727 


2647 


163 | 


r 2 i 


lH'i 


|l.be-14 | 



Protein name 



Locus Name 



Acc# 



sp:SURA_ECOLI



Description 








P21202:P75630


SURA), (PPIASE) (ROTAMASE C)








ORF Name 


NTID AAID 


NT 
Length 


AA 

, — , , Score 
Length 


Probability 


|S287S55_c2_i02 


728 2648 


125 


378 |i25 | 


|5.0e-08 


Protein name 






Locus Name 


Acc# 



Description 



sp:YC53_HAEIN



P44139 



HYPOTHETICAL PROTEIN HI1253








ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|5322062_ci_69 


|729 |2649 


265 


|798 | |449 | 


2.3e-42 | 


Protein name 






Locus Name 


ACC# 



sp:MOEB_ECOLI



P12282 



Description 
MOLYBDOPTERIN BIOSYNTHESIS MOEB PROTEIN






ORF Name 



582220a c2 M 



Protein name 



Description 



NTID 



7T0~ 



AAID 



NT 
n 



AA 

— Score 
Length Length 



Probability 
5.5e-48 



Locus Name 



sp:HEMK_ECOLI 



Acc# 

P37186:Q46754



HEMK PROTEIN 


ORF Name 


NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


5829537_r2_38 


| |73i 2651 


455 


1358 


525 


6.6e-93 | 



Protein name 



Description 



Locus Name 



sp:CYSG_ECOLI



Acc# 

P11098:P76685



ORF Name 



7 11 7 182 c3 114 



Protein name 



unknown 



Description 



NTID 



TTT 



AAID 



NT 



AA 



— — , Score Probability 

Length Length 

TIE 1 |348 | [200 | |5.5e-16 

Locus Name Acc# 



gp:AF033858



AF033858 



Pediococcus pentosaceus strain ATCC43200 plasmid pMD136, complete plasmid
sequence.



ORF Name 



NTID 



AAID 



Score 



7281552 cl 77 



TIT 



NT AA 
Length Length 

] [1251 | fTJT 



TIE 



Protein name 



Description 



Locus Name 



sp:CARA_PSEAE



Probability 
|5.4e-130 

Acc# 
P38098 



PHOSPHATE SYNTHETASE GLUTAMINE CHAIN)


NT 

ORF Name NTID AAID _ — . , 

Length 


AA 

. — . , Score 
Length 


Probability 


953181_c3_105 |734 2654 |450 


|1353 | 449 | 


2.3e-42 | 


Protein name 


Locus Name 


Acc# 



sp:Y16S_MYCTU



P96936 



Description 
HYPOTHETICAL 54.8 KD PROTEIN CY20H10 . 2«C






ORF Name 



NT ID 



AAID 



735 



NT AA 
Length Length 
TTJ2 — 



Score Probability 



TUT 



Protein name 
Description 
NO-HIT 



Locus Name 



Acc# 



ORF Name 



NT ID 



AAID 



16985642 C2 43 



NT AA 

— — Score 

Length Length 

] E 



EEZI I 



TIT 



Probability 
[9 . 7e-U8 



Protein name 

Description 
GLUTAMATE RACEMASE, 



Locus Name 



sp:MURI_SYNY3



ACC# 
P73737 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


189186_c3_54 


737 


| 2657 


i p» 


|702 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

- — . , Score 
Length 


Probability 


|23628411_tl_6 


pa 


2658 


|295 


|888 | |379 | 


6.1e-35


Protein name 








Locus Name 


Acc# 



Description 



sp:YCHB_ECOLI



P24209 



HYPOTHETICAL 30.9 KD PROTEIN IN HEMM-PRSA INTERGENIC REGION




ORF Name 


NT AA 
NTID AAID .. — L , . — . , Score 
Length Length 


Probability 


29494003_t2_13 


739 | 2659 | 672 2019 | 241 | 


|6.6e-17 | 



Protein name 

Description 
HYPOTHETICAL 64. 



Locus Name 



sp:YHE3_PSEAE 



Acc# 
P42810 



8 KD PROTEIN IN HEMM-HEMA INTERGENIC REGION (ORF3)






ORF Name NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|29960761 cl 30 | |740 


1 12650 
1 1 


| 468 
1 


1407 |73i | 


|3 . Oe 




Protein name 








Locus Name 




Acc# 










sp:HEM1_PASMU


P95525 


Description 














GLUTAMYL-TRNA REDUCTASE, (GLUTR)










1 


ORF Name NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|31291251_r2_15 | |74I 


| |266i 


|265 


|798 | |924 | 


l.le- 


1 


Protein name 








Locus Name 




Acc# 



Description 



gp:ECOPRS



M13174 



E.coli prs gene encoding phosphoribosylpyrophosphate synthetase, complete
cds.



ORF Name 



134641308 13 23 



Protein name 



NTID AAID 



NT 



AA 



— , _ — , , Score 



Length Length 



Probability 



T5T 



|196 | |59i | |160 | |9.7e-12 — 
Locus Name Acc# 



sp:LOLB_PSEAE 



P42812 



Description 
OUTER MEMBRANE LIPOPROTEIN LOLB PRECURSOR 



ORF Name 



NTID 



AAID 



AA 

— , Score 
Length Length 



4869025 t3 19 



12663 



NT 
n 



Probability 



1665 | [1876 | |1.4e-193 



Protein name 



Locus Name 



sp:ETFD_ACICA 



ACC# 
P94132 



Description 

DEHYDROGENASE) (ELECTRON-TRANSFERRING-FLAVOPROTEIN DEHYDROGENASE)



ORF Name 



110552153 tl 31 



NTID 



AAID 



NT AA 
— — Score 

Length Length 



] [2664 | |74 | [225 | pg~ | [ 



Probability 
7.1e-12 



Protein name 

Description 
Mus musculus P4(21)n mRNA, partial cds.



Locus Name 



gp:AB028868



Acc# 
AB028868 






ORF Name 



NT ID AAID 



110722125 12 55 



2665 



NT AA 
Length Length 
T5T 1 — 



Score 



T5"5~ 



Probability 
2.9e-12



Protein name 



Description 



Locus Name 



sp:SMPA_ECOLI



Acc# 
"I P23089 



SMALL PROTEIN A 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


il7037_i3_138 


745 


|2£>56 


218 | 


p7 1 375 


|1.6e-34 


Protein name 








Locus Name 


Acc# 



sp:Y787_HAEIN 



P44052 



Description 
HYPOTHETICAL PROTEIN HI0787 



ORF Name 



111885327 11 51 



NTID AAID 

] 



— , — , Score Probability 



7TT 



NT AA 
Length Length 

□ EE 



7S 



] 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT


ORF Name NTID AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


|12297203_c3_2!}6 748 | |2668 


1 P y 1 


|237 | pa | 


p.ie-09 


Protein name 




Locus Name 


Acc# 


hypothetical protein APE2061




pir:G72510


G72510 


Description 








ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|1250663b_12_83 749 | 2659 




poi | 




Protein name 




Locus Name 


Acc# 



Description 
NO-HIT






ORF Name 


NT AA 
NT ID AAID _ — . , , — . , Score 
Length Length 


Probability 


112697037 'c2" 20T 


1 75TJ 12570 111 1336 1 199 1 




Protein name 


Locus Name 


Acc# 




gp:PADLDH


""j X70925 


Description 






P.acidilactici gene for D-lactate dehydrogenase.


i 


ORF Name 


NT AA 
NT ID AAID . — , . — , Score 
Length Length 


Probability 


|i30784i6_c3_233 


| 751 2671 | |167 | (504 | |210 | 


|4.9e-17 



Protein name 



Locus Name 



ribosomal-protein-serine
N-acetyltransferase, rimL homolog ydaF



Description 



pir:F69768



Acc# 
F69768 



ORF Name 



NT ID AAID 



1134635 C2 226 



7S2" 



NT 
n 
TT7 



AA 

— Score 
Length Length 



[444 | 



1TT9" 



Protein name 



Locus Name 



ferric uptake regulator



: AUDNAFUR 



Probability 
|6.7e-52 

Acc# 
Y14980 



Description 



Acinetobacter baumannii fur gene.


ORF Name NT ID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


13726003_c2_196 | 753 2673 


349 


J1050 736 


|8.9e 


I 


Protein name 






Locus Name 




Acc# 


iron transport protein:protein
slr1295:protein slr1295




pir:S74691


S74691 










Description 












ORF Name NTID AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


|13870462_t3_144 754 | 2674 


i ui 


P 6 | ,117 | 


3.5e-07


Protein name 






Locus Name 




Acc# 


hypothetical protein jhp1163




pir:B71840


B71840 



Description 






ORF Name NTID 


AAID 


NT 
Length 


AA 

, — , Score 
Length 


Probability 


ft.4553432 t'2 57 755 


[2575 


1 1 


|912 | |647 | 


2.4e-63


Protein name 








Locus Name 




Acc# 










sp:METR_SALTY


P05984 


Description 














TRANSCRIPTIONAL ACTIVATOR PROTEIN METR








1 


ORF Name NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|15820312_c2_223 | |755 


| p*7t 


i r v i 


|2604 | |1893 | 


2.2e-195


Protein name 








Locus Name 




Acc# 


UspA2 




gp:AF113611 


AF113611 



Description 

Moraxella catarrhalis strain V1171 uspA2 (uspA2) gene, complete cds.



ORF Name 



NTID 



AAID 



15016926 E3 137 



NT 
Length 
2T7 



— — Score Probability 



Protein name 



AA 
Length 

[7T4 1 |197 | |3.ie-15 — 

Locus Name Acc# 



growth factor-responsive protein, vascular
smooth muscle:SM-20



pir:A53770



A53770 



Description 



ORF Name 



NTID 



AAID 



16064061 c2 1H7 



7SF 



NT 
Length 
\T51 1 



Protein name 

Description 
(EC 2.4.1.-) 



AA 

, — L1 Score Probability 

Length 

[1194 | |738 | |5.5e-73 — 

Locus Name Acc# 



sp:MURC_HAEIN 



P45065 



ORF Name 



NTID AAID 



16171905 c2 202 



T5T 



][ 



NT 
Length 
73 



— , Score Probability 



AA 
Length 



J 



Protein name 
Description 
NO-HIT



Locus Name 



ACC# 






ORF Name 


NT ID 


AAID 


NT 
Length 


- — . , Score 
Length 


Probability 


16585933 ti 11 


750 


12680 1 

r 1 


765 


2298 |2390 | 


|4.8e-248 
1 


Protein name 








Locus Name 


ACC# 










sp:IDH_AZOVI


P16100


Description 












DECARBOXYLASE) (IDH)










ORF Name 


NT ID 


AAID 


NT 
Length 


— Score 
Length 


Probability 


I6678186_c3_25i 


761 


£t O O A. 


|488 | 


11457 "| 11057 1 


8.6e-107


Protein name 








Locus Name 


Acc# 


hypothetical protein vszdv.4




pir:T21659


T21659 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

- — . , Score 
Length 


Probability 


2008005i_ci_17i 


| 762 


|2682 | 


299 | 


900 | |330 | 


|9.4e-30 


Protein name 








Locus Name 


Acc# 










sp:YJJV_ECOLI 


P39408:P78143


Description


HYPOTHETICAL 28.9 KD PROTEIN IN OSMY-DEOC INTERGENIC REGION




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


20509628_c2_204 


763 


| 2683 


378 | 


|1137 | |891 | 


|3.4e-89 



Protein name 



Description 



Locus Name 



sp:FTSZ_ECOLI



ACC# 

P06138:P78047:P77857



CELL DIVISION PROTEIN FTSZ 


ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


2i32006_c2_2I8 | 764 | |2684 


1 m 1 


399 153 


2.4e-10 


Protein name 




Locus Name 


Acc# 


hypothetical protein sll1830




pir:S75232 


S75232



Description 






ORF Name 



NTID 



AAID 



1 22048442 t2 95 



7F5~ 



NT AA 

— — Score 

Length Length 

TU2 1 11209 



Probability 
|8.2e-102 



Protein name 
Description 

SUCCINYL-DIAMINOPIMELATE DESUCCINYLASE, (SDAP)



Locus Name 



sp:DAPE_HAEIN



Acc# 
P44514 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|22323956J:3J.30 | 




|268G 


1 41 1 


|185 | |109 | 


6.5e-06


Protein name 










Locus Name 




Acc# 


hypothetical protein PH0221




pir:D71245 


D71245 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

t — , i Score 
Length 


Probability 


|224633ii_t3_128 | 




|2687 


1 I 103 1 


312 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — . i Score 
Length ■ - 


Probability 


|22734807_rl_40 | 


P « 


|2688 


97 


|294 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|22930306_cI_I70 


759 


|2689 


354 


(1055 |427 


5.0e-40



Protein name 



Locus Name 



5'-nucleotidase



gp:CLI131243



Acc# 
AJ131243 



Description 
Columba livia mRNA for 5'-nucleotidase.






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


123145135" 12 50 1 


|770 


|2690 


583 | 


1752 752 


4.3e-120


Protein name 








Locus Name 




Acc# 


NH(3)-dependent NAD(+) synthetase




pir:G72277




G72277 


Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|23532300_cl_165 | 


|771 


| |2691 


i p i 


p" i 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|23556625_C!3_24b j 


|772 


| 2692 


i r 1 i 


|726 | |1U5 | 


2 .2e- 


1 


Protein name 








Locus Name 




Acc# 










sp:FTSQ_ECOLI 




P06136 


Description 














CELL DIVISION PROTEIN FTSQ 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|23615832_tl_15 | 


773 


2693 


|334 


1005 125 


9.1e-09


Protein name 








Locus Name 




ACC# 


lysophospholipase homolog






pir:T02661 




T02661 


Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|24111015_c3_259 | 


|774 


j |2694 


i i ib4 i 


|Ub | pV4 , 


8.1e-24


Protein name 








Locus Name 




ACC# 



Description 



sp:RIBP__ECOLI 



P08391:P75621






ORF Name 



I2421905G cl 184 



Protein name 



Description 



NTID AAID 



77F" 



NT 
n 



AA 

— Score 
Length Length 



7TT 



5TT 



Probability 
4.9e-49 



Locus Name 



sp:YPT5_PSEAE 



Acc# 
P24562 



HYPOTHETICAL 24 . b 


KD PROTEIN IN PILT 5' REGION (ORF5)




ORF Name 


NT 

NTID AAID _ . , 

Length 


AA 

. — . , Score 
Length 


Probability 


|242207B6_tI_4 


| |776 | |2S96 | |370 | 


|1113 | |835 | 


|2.9e-83 


Protein name 




Locus Name 


Acc# 



Description 



sp:PILT_PSEAE 



P24559 



TWITCHING MOBILITY PROTEIN






1 


ORF Name 


NTID AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


|24250928_ci_i53 


777 2597 


78 


237 | [207 | 


|i.0e-i<5 


Protein name 






Locus Name 


ACC# 



sp:YFHJ_ECOLI



P37096 



Description 

HYPOTHETICAL 7.7 KD PROTEIN IN PPMH-VDX INTERGENIC REGION



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|24255260_c2_229 


|778 


| |2698 


115 


i 348 i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — , , Score 
Length 


Probability 


|24412562_c2_189 


| |779 


| |2699 


pi4 | 


|945 | |740 | 


|3.4e-73 


Protein name 








Locus Name 


Acc# 



sp:DDL_HAEIN 



P44405 



Description 

D-ALANINE--D-ALANINE LIGASE, (D-ALANYLALANINE SYNTHETASE)






ORF Name 



|2441b875 cl 174 



Protein name 



Description 



NTID 



AAID 



T7W 



NT 
n 
TZ2 



AA 

— Score 
Length Length 



TOT- 



Locus Name 



sp:HEMZ_ECOLI



Probability 
6.0e-83 



Acc# 

P23871:P78232



SYNTHETASE) 












ORF Name NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


24S44577_t2_95 | |78i 


| |270i 


bib | 


|1608 | 


923 


| |i.4e-92 


Protein name 






Locus Name 


Acc# 


hypothetical protein






pir:S76051 


S76051 


Description 












ORF Name NTID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


24813161_13_143 |732 | 2702 


344 


1035 


650 


| |1.2e-63 | 


Protein name 






Locus Name 


Acc# 


MsmX 






gp:AB013374


AB013374 


Description 












Bacillus halodurans C-125 msmX, yjciA, ykoK and yvtK genes, partial and
complete cds.




ORF Name NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


25579510_i3_i09 |783 


2703 


156 


I 471 1 


99 


0.0018 | 


Protein name 






Locus Name 


Acc# 


myosin alpha heavy chain, masticatory muscle


pir:S33732


S33732 


Description 












ORF Name NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


26212 750_C2_205 784 


2704 


325 


990 


285 


|7.8e-25 | 


Protein name 






Locus Name 


Acc# 



Description 



gp:ATAC006436



AC006436 



Arabidopsis thaliana chromosome II BAC F13J11 genomic sequence, complete
sequence.






ORF Name 



NT ID AAID 



NT AA 

— , — , Score Probability 
Length Length 



26578567_cl_l<!>4 | |785 | |2705 


53 


192 88 | 


0.00042 


Protein name 






Locus Name 


Acc# 


Hypothetical protein 2 9.1 


pir:S59084


S59084 


Description 










ORF Name NTID AAID 


NT 
Length 


AA 

— , Score 
Length ■■ ■ 


Probability 


|26BI3i35_c2_192 | |786 | |2706 


1 P 4 1 


|I575 | |1287| 


|3.7e-m 1 


Protein name 






Locus Name 


Acc# 


alkyl hydroperoxide reductase, mz* protein




pir:D64794 


D64794 


Description 










ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


273427_c^_263 787 | 2707 


F 4 | 


|575 | 583 


1.5e-56 | 


Protein name 






Locus Name 


Acc# 








sp:DEDA_ECOLI


P09548 


Description 










DEDA PROTEIN (DSG-1 PROTEIN)


ORF Name NTID AAID 


NT 
Length 


AA 

Length 


Probability 


|29337825_t2_J>2 | 788 2708 


57 


204 131 


1.2e-08


Protein name 






Locus Name 


Acc# 



Description 



sp:YPT1_PSEAE



P24560 



glutathione synthetase



D88540 



D88540 



Description 

Synechococcus sp. DNA for glutathione synthetase, complete cds.



HYPOTHETICAL 17.0 KD PROTEIN IN PILT 5' REGION (ORF1)


i 


ORF Name NTID AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|31252514_c3_231 |789 2709 


69 i 


|207 | |S5 | 


|0.0054 | 


Protein name 




Locus Name 


Acc# 






ORF Name 



NT ID AAID 



NT 



AA 



Length Length 



Score Probability 



13236505 c2 217 
1 ~ ~ . 


790 


|2710 


119 


360 | 127 


1.7e-07


Protein name 










Locus Name 




Acc# 


hypothetical protein sll1830






pir:S75232 


S75232 


Description
















ORF Name 


NT ID 


AAID 


NT 
Length 


. — . , Score 
Length 


Probability 


|32593750_il_27 




| |271i 


1 P" 1 


|1101 | |1382 | 


p.le 


-141 | 


Protein name 










Locus Name 




Acc# 












sp:RECA_ACICA 


P42438 


Description 
















RECA PROTEIN




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


33788286_c2_188 


752 


| 2712 


i 493 


|1482 |1448 | 


3.2e-148



Protein name 



Locus Name 



UDP-N-acetylmuramate:L-alanine ligase MurC



AF110740 



Acc# 
AF110740 



Description 



Pseudomonas aeruginosa UDP-N-acetylmuramate:L-alanine ligase MurC (murC)
gene, complete cds.



ORF Name 



NTID 



AAID 



34040777 ci 169 



T5T 



TTTT 



NT AA 
Length Length 
1 f^I 



Score 



Protein name 



Description 



Locus Name 



sp:PPDD_ECOLI



Probability 

0.00011

Acc# 
P36647 



PREPILIN PEPTIDASE DEPENDENT PROTEIN D PRECURSOR 


NT 

ORF Name NTID AAID — ^ 

Length 


AA 

. — . , Score 
Length 


Probability 


34159412_ti_53 | 794 | 2714 |330 


993 |497 | 


1.9e-47


Protein name 


Locus Name 


Acc# 


oxidative stress transcriptional regulator 


gp:XCU94336 


U94336


Description 


Xanthomonas campestris alkyl hydroperoxide reductase subunits C (ahpC) and F
(ahpF) and oxidative stress transcriptional regulator (oxyR) genes, complete
cds.






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


135275851 ±3 T32 


795 


2715 


80 


243 88 


0.00042 


Protein name 










Locus Name 




Acc# 


hypothetical protein








pir:D75542 


D75542 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


3907:*il_r:*_il0 


1 1™ 


| |2716 


1 1 74 i 


p » 1 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


3928750_c3_242 


| 797 


| 2717 


| 298 | 


|897 | 412 


1.9e-38


Protein name 










Locus Name 




ACC# 



Description 



sp:YHIR_HAEIN



P31777 



HYPOTHETICAL PROTEIN HI0441 (ORFJ)








ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


3940943_t2_b9 


798 |2718 


I 118 1 


|ibV | |W | 


0.00012


Protein name 






Locus Name 


ACC# 



Description 



sp:YSPE_ECOLI 



P45580 



HYPOTHETICAL 12.6 KD PROTEIN IN PEPP-SSR INTERGENIC REGION (O109)


ORF Name 


NT AA 
NTID AAID _ — . , _ — 

Length Length 


Score 


Probability 


3947318_t3_116 


| |799 j 271S | 276 | |831 


| |1038 


| |8.9e-105 | 



Protein name 



Locus Name 



sp:DAPD_MYCBO



ACC# 
P56220 



Description 

(THP SUCCINYLTRANSFERASE) (TETRAHYDROPICOLINATE SUCCINYLASE)






ORF Name 



NT ID AAID 



14017832 Tl 85 



800 



12720 



NT AA 
Length Length 
2TI — 



Score 



5T5~ 



Probability 
2.0e-50 



Protein name 



Locus Name 



DedA family protein



Description 

ORF Name 
|4023342_cl_186 



Protein name 

Description 
(F194)



pir:B75253



Acc# 
B75253 



NT ID 

]EEn 



AAID 



NT AA 
— — Score 
Length Length 



][ 



\ztzt 



] [ 



Probability 
] |145 | |J.8e-iO 



Locus Name 



sp:YGFB_ECOLI



Acc# 
P25533 



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|402336J:2_89 


| 802 


] |2722 


83 


252 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|4140943_t2_73 


|803 


| 2723 


j 157 


|474 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|4331430_tl_28 


ir 4 


| (2724 


1 1 3 " i 


|942 | |124 | 


|2.4e-16 


Protein name 








Locus Name 


Acc# 



sp:RECX_VIBCH 



Q56647 



Description 
REGULATORY PROTEIN RECX 






ORF Name 
j4332B37_c3_2B6 



NT ID AAID 



NT AA 
Length Length 
Tu"5 — 



Score 



TUT 



Im- 



Probability
6.7e-10 



Protein name 



Locus Name 



hypothetical protein sll1830



pir:S75232



ACC# 
S75232 



Description 

ORF Name 
|4348813_c3_2b9 
Protein name 

Description 



NTID AAID 



NT AA 
Length Length 







10 " 


1 



Locus Name 



sp:GCP_HAEIN 



Probability 
[l.fle-104 

ACC# 
P43764 



(GLYCOPROTEASE)


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|4694427_t3_ii4 


| 807 


| |2727 


1 1 3 " i 


|1089 |884 | 


|1.9e-88 


Protein name 








Locus Name 


ACC# 



sp:LIPA_HAEIN 



P44463 



Description 

LIPOIC ACID SYNTHETASE (LIP-SYN) (LIPOATE SYNTHASE)



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|4773260J:2_79 


| [80S 


| |2728 


i r i 


p« i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|479853S_c2_191 


| 809 


| |2729 


| 2T2 | 


|639 | 685 | 


|2.3e-57 | 



Protein name 



Locus Name 



alkyl hydroperoxide reductase subunit C 



gp:AF129406 



Acc# 
AF129406 



Description 



Bacteroides fragilis alkyl hydroperoxide reductase subunit C (ahpC) and alkyl
hydroperoxide reductase subunit F (ahpF) genes, complete cds.






ORF Name 



14824062 c2 200 



Protein name 



Description 



NT ID AAID 



STCT 



T7Tu~ 



NT AA 
Length Length 
T71 



Score 



HIT 



T5T 



Locus Name 



sp:PSS_HELPY 



Probability 
3.4e-32



Acc# 

Q48269:O07681



(PHOSPHATIDYLSERINE SYNTHASE)


ORF Name NTID AAID 


NT 
Length 


. — . , Score 
Length 


Probability 


6ii0943_c2_203 | |8ii |273i 




1308 |303 | 


|3.5e-26 | 


Protein name 




Locus Name 


Acc# 






sp:FTSA_BUCAP 


O51928


Description 








CELL DIVISION PROTEIN FTSA 


ORF Name NTID AAID 


NT 
Length 


AA 

r — . i Score 
Length - - 


Probability 


S59686_c2_210 812 2752 


278 


837 | 485 


3.5e-4S 



Protein name 



Description 



Locus Name 



sp:YGDL_HAEIN 



Acc# 

Q57097:O05009



HYPOTHETICAL PROTEIN HI0118 


ORF Name NTID AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


6754081_c3_235 813 2733 


|254 


|765 | 169 | 


1.1e-12


Protein name 






Locus Name 




Acc# 


hypothetical protein MTH913 9




pir:G69225


G69225 


Description 












ORF Name NTID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


581453Jb3_127 |814 2734 


|84 


pbb | |Ve | 


|0.038 | 



Protein name 



Locus Name 



|sp:YXEH_BACSU 



ACC# 
P54947 



Description 

HYPOTHETICAL 30.2 KD PROTEIN IN IDH-DEOR INTERGENIC REGION






ORF Name 



NTID AAID 



I682641 tl 42 



T7JT 



NT AA 
Length Length 
FS 1 IZST — 



Score 

ED 



Protein name 



Locus Name 



Probability 
|2.2e-05 

Acc# 



hypothetical protein PH0217 



pir:G71244



G71244 



Description 

ORF Name 
|72221S7_tl3?5" 



NTID AAID 



NT AA 
Length Length 

] | 816 | p 736 | |245 | |738 



Score 
1287 J 



1 1 



Protein name 



Locus Name 



conserved hypothetical protein ykrA



pir:C69862



Probability 
| 3.4e-2b 

Acc# 
C69862 



Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|781563_c2_227 


|817 


| |2737 | 


113 | 


|342 | 255 | 


|7.3e-23 | 


Protein name 








Locus Name 


ACC# 










sp:YPTS_PSEAE 


P24564 


Description 












HYPOTHETICAL 19.5 KD PROTEIN IN PILT REGION (ORES)


1 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|812535_rl_43 




| |2738 


77 


234 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|10625252_tl_3 


||81 S 


| |2739 | 


r i 


|1746 | |1790 | 


|1.8e-184 1 



Protein name 



Locus Name 



sp:SYP_HAEIN 



Acc# 
P43830 



Description 

PROLYL-TRNA SYNTHETASE, (PROLINE--TRNA LIGASE) (PRORS)






ORF Name 



NTID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


20495062 c2 30 


820 2740 


408 


1227 |1522 | 


4.6e-156


Protein name 








Locus Name 




Acc# 










sp:TRPB_ACICA 


P16706 


Description 














TRYPTOPHAN SYNTHASE BETA CHAIN,










i 


ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


22847180 cl 17 


821 |2741 


1 213 1 
1 1 


(642 | 568 


5.7e-55


Protein name 








Locus Name 




Acc# 










sp:YADG_ECOLI


P36879 


Description 














HYPOTHETICAL ABC TRANSPORTER ATP-BINDING PROTEIN YADG




i 


ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


24642268_c3_37 


822 | [2742 


|213 


642 |510 | 


7.9e-49


Protein name 








Locus Name 




Acc# 










sp:TRPF_ACICA 


P16923 


Description 














N-(5'-PHOSPHORIBOSYL)ANTHRANILATE ISOMERASE, (PRAI)




i 


ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


24814125_c2_3i | 


|823 | 2743 


i m i 


|858 | |54i | 


4.1e- 


i 


Protein name 








Locus Name 




ACC# 


tryptophan synthase 


alpha chain 




gp:AF107094


AF107094 



Description 



Rhodobacter sphaeroides thiamine biosynthetic protein (thi) gene, partial 
cds; and tryptophan synthase alpha chain (trpA) gene, complete cds.



ORF Name 
|30727194_c2_24 

Protein name 
Description 
NO-HIT



NTID AAID 



NT 



AA 



— — Score 
Length Length 



\rwr 



ED 



Locus Name 



Probability 



Acc# 






ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


3941542 c2 25 


j |825 2745 | 


100 


303 224 1 


1.6e 


1 


Protein name 








Locus Name 




Acc# 










sp:YADG_ECOLI 


P36879 


Description 














HYPOTHETICAL ABC TRANSPORTER ATP-BINDING PROTEIN YADG




1 


ORF Name 


NTID AAID


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


[5181557 cl 19 


| 826 [ 2746 


281 


|846 | |553 


|2.2e 


1 


Protein name 








Locus Name 




ACC# 










sp:YQCD_ECOLI 


Q46920 


Description 














HYPOTHETICAL 32.6 KD PROTEIN IN SYD-SDAC INTERGENIC REGION




1 


ORF Name 


NTID AAID 


NT 
Length 


AA 

T — ■ i Score 
Length 


Probability 


|4426338_c2_26 


827 2747 


260 | 


|783 | |744 | 


|1.3e 


1 


Protein name 








Locus Name 




ACC# 










sp:YADH_ECOLI 


P36880:P75657














HYPOTHETICAL 28.5 KD PROTEIN IN HPT-PAND INTERGENIC REGION




1 


ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


5120412_c3_32 


828 2748 


186 


561 |196 | 


1.5e-15


Protein name 








Locus Name 




Acc# 


cytochrome c5 




gp:AVU94420 


U94420 



Description 



Azotobacter vinelandii aldehyde dehydrogenase (aldh) gene, partial cds,
cytochrome c5 (cycB) gene, complete cds, and
xanthine phosphoribosyltransferase-like protein (xrpt) gene, partial cds.


NT AA 

ORF Name NTID AAID . — ^ — ^ 

Length Length 


Score 


Probability 


|7031312_c3_J33 829 2749 66 |201 | 


129 


7.Se-08 



Protein name 



Locus Name 



sp:YADG_ECOLI



Acc# 
P36879 



Description 

HYPOTHETICAL ABC TRANSPORTER ATP-BINDING PROTEIN YADG






ORF Name 



NT ID AAID 



111198430 c3 60 



NT AA 

— , — , Score 

Length Length 

m 1 12079 



13292 



Probability 
UT75 



Protein name 



Locus Name 



lactoferrin binding protein B



gp:AF043131



Acc# 
AF043131 



Description 



Moraxella catarrhalis strain 4223 lactoferrin binding protein B(lbpB) and
lactoferrin binding protein A (lbpA) genes, complete cds; and unknown genes.


NT AA 

ORF Name NTID AAID , — , — . , 

Length Length 


Score Probability 


|16128933_c2_53 831 |2751 159 |480 


|377 | |9.9e-35 | 



Protein name 



Locus Name 



apolipoprotein N-acyltransferase



gp:AF038595



Acc# 
AF038595 



Description 



Pseudomonas aeruginosa apolipoprotein N-acyltransferase (cutE) gene, complete
cds.



ORF Name 



NTID 



AAID 



19704378 c3 64 



NT AA 

— — Score 

Length Length 

^ i 



Protein name 



Locus Name 



Probability 
|8.3e-125 

Acc# 



unknown 



gp:AF043132



AF043132 



Description 



Moraxella catarrhalis strain Q8 lactoferrin binding protein B(lbpB) and
lactoferrin binding protein A (lbpA) genes, complete cds; and unknown genes.



ORF Name 



NTID 



AAID 



124337826 c3 67 



[275T 



NT 

sn 
ITS 



AA 

— Score 
Length Length 



Probability 



Protein name 



[2157 | p7B~| |2.9e-21 ~ 
Locus Name Acc# 



hypothetical protein K08H10.2a 



pir:T23512



Description 



T23512:T24613






ORF Name 


NT ID AAID 


NT 
Length 


AA 

• — . , Score 
Length 


Probability 


34164812_c2_54 


| 834 | 2754 


|198 


597 |712 


|l.&e-80 


Protein name 






Locus Name 


Acc# 


lactoferrin binding protein B




gp:AF043133 


AF043133 


Description 


Moraxella catarrhalis strain VH19 lactoferrin binding protein B(lbpB) gene,
complete cds.


ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


35837503_i2_2S 


| 835 (2755 


i 41 


|186 




Protein name 






Locus Name 


Acc# 


Description 










NO-HIT


ORF Name 


NTID AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


35945257_cl_46 


836 2756


67 


204 




Protein name 






Locus Name 


Acc# 


Description 










NO-HIT


ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


3906263_c2_55 


|837 2757 


| 1003 | 


(3012 | 5252 | 


0.0 


Protein name 






Locus Name 


Acc# 


lactoferrin binding protein A




gp:AF043131


AF043131 


Description 


Moraxella catarrhalis strain 4223 lactoferrin binding protein B(lbpB) and
lactoferrin binding protein A (lbpA) genes, complete cds; and unknown genes.


ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


3945388_r3_30 


838 2758 


413 


1242 |I219| 


5.9e-124


Protein name 






Locus Name 


Acc# 


beta-ketoacyl-ACP synthase I 




|gp:PAU70470 


U70470 


Description 


Pseudomonas aeruginosa lemA-type sensor kinase/response regulator homolog
gene, partial cds, beta-hydroxy-ACP dehydrase (fabA) and beta-ketoacyl-ACP
synthase I (fabB) genes, complete cds.






ORF Name NT ID AAID — , — Score 
Length Length 



|400b250_c3_S8 839 2759 


163 1 


|492 | 572 


2.1e 


~ bb 1 


Protein name 






Locus Name 




Acc# 


ribosomal protein S12: streptomycin resistance protein




pir:A42939 




B42939:A42939:H64078








Description 












ORF Name NTID AAID 


NT 
Length 


AA 

„ — , Score 
Length 


Probability 


|4093767_c3_63 | 840 | |2760 | 


|b44 | 


|U35 | |2854 | 


3.2e-297


Protein name 






Locus Name 




Acc# 


unknown 






gp:AF043131 


AF043131 


Description 










Moraxella catarrhalis strain 4223 lactoferrin binding protein B(lbpB) and
lactoferrin binding protein A (lbpA) genes, complete cds; and unknown genes.


ORF Name NTID AAID 


NT AA 
— — Score 
Length Length 


Probability 


|4804632_t2_17 841 | 2761 | 


|485 


1458 572 


2.1e-55


Protein name 






Locus Name 




Acc# 








sp:PABB_SALTY




P12680 


Description 












PARA-AMINOBENZOATE SYNTHASE COMPONENT I, (ADC SYNTHASE)




i 


ORF Name NTID AAID 


NT AA 
— — Score 
Length Length 


Probability 


|10i8_cl_12 | 842 | |27£2 | 


229 | 


|690 | |163 | 


|4.7e- 


i 


Protein name 






Locus Name 




Acc# 






i 


gp:B™B78P21 




X87092 


Description 












Bacteriophage MB78 ORFs p21, p11.5, p26 & p28








1 


ORF Name NTID AAID 


NT AA 
— — Score 
Length Length 


Probability 


12303577_cl_ll |843 2763 


|105 


318 






Protein name 






Locus Name 




Acc# 



Description 



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length - ■ 


Probability 


|i9B57893_c2JL7 


| |844 


| |2754 


1 1 


381 






Protein name 








Locus Name 


Acc# 


Description 














NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|20038305_c2_15 


if 45 


| |2765 


i p" i 


|228 | 






Protein name 








Locus Name 


Acc# 


Description 














NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


_ — . , Score 
Length 


Probability 


pi34555_clJL3 


II s46 


| |2766 


i u " i 


510 






Protein name 








Locus Name 


Acc# 


Description 














NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


23470625_C3_19 


II s " 


| |2767 


| 1.0 | 


|570 


287 


5.9e-25 



Protein name 



Description 



Locus Name 



gp:RP4TRAN0KP



Acc# 
L10330 



Plasmid RP4 traN gene, complete cds; traO gene, complete cds; kfrA gene,
complete cds.



ORF Name 
|23fi32fli8_c3_2S 



NT ID AAID 
] |848 



NT AA 
— — Score 
Length Length 



] i 



Protein name 
Description 
NO-HIT



Locus Name 



Probability 



Acc# 






ORF Name 



131555 c3 20 



Protein name 



coat protein 



Description 



NT ID AAID 

1 



NT AA 

— — Score 

Length Length 

TJ2 1 [TTOT 



Locus Name 



pir:S58142



Probability 
|2.0e-17 



Acc# 

S58142:T42283



ORF Name 



Protein name 

Description 

NO-HIT 



NT ID 



AAID 



j343812S6_c3_21 | |850 | [2770 | 



NT AA 
Length Length 



— — Score Probability 



] EO 



Locus Name 



Acc# 



ORF Name 



34485537 c3 22 



Protein name 



Description 



NO-HIT 



NT ID AAID 



][ 



F5T" 



277T 



NT AA 
Length Length 
TT7 1 FTTT 



Score Probability 



Locus Name 



Acc# 



ORF Name 
|39079g3^cr7T" 



Protein name 
Description 



NO-HIT



NTID AAID 



852 



][ 



2772 



NT AA 
Length Length 

m 1 — 



Score 



Locus Name 



Probability 



Acc# 



ORF Name 
|4804763_cS7T5- 



Protein name 



Description 



NTID AAID 
] [2773" 



[FFT 



NT AA 

— — Score 

Length Length 

] EH 



122 



Locus Name 



Probability 



Acc# 



NO-HIT






ORF Name 


NTID 


AAID 


NT 
Length


AA 


Probability 


5928168_c2_18 


ii 854 


| |2774 


| 308 


927 




Protein name 








Locus Name 


Acc# 


Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


12625177_tl_2 


| 855 


| |2775 


1 1 148 i 


[447 | |435 | 


|7.0e-41 


Protein name 








Locus Name 


Acc# 



Description 



sp:DKSA_ECOLI



P18274 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


14241635_c2_80 


| 856 


| 2776 


1 1 


i 5 " i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — , , Score 
Length 


Probability 


|1487688i_c2_82 


| |857 


2777 


256 


771 350 


7.2e-32 


Protein name 








Locus Name 


Acc# 



Description 



sp : EM<L H H_i4ACN0 



P17419 



POSSIBLE FIMBRIAL ASSEMBLY PROTEIN EIMO (SEROGROUP Ml)



ORF Name 



NTID 



AAID 



NT AA _ _ . , . _ . u 
— , — , Score Probability 
Length Length 



16181301 El 63 



2778 | [203 | [612 | [ 



|2.6e-06 



Protein name 



Locus Name 



sp:YGGH_ECOLI



Acc# 
P32049 



Description 

HYPOTHETICAL 27.3 KD PROTEIN IN ANSB-MUTY INTERGENIC REGION (E233)






ORF Name 



NT ID 



AAID 



16898413 cl 7 7 



2779 



NT 
n 
TOT 



AA 

— Score 
Length Length 



[JUT 



Probability 
1286 J |9.1e-24 



Protein name 



Description 



Locus Name 



sp:Y712_HAEIN 



Acc# 
P44836 



PROBABLE TONB-DEPENDENT RECEPTOR HI0712 PRECURSOR




ORF Name 


NT ID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|21517252_tl_3 


860 2780


I ™ I 


J1866 (1417 | 


|6.1e-145 


Protein name 






Locus Name 


Acc# 



Description 



sp:UVRC_PSEFL 



P32966 



EXCINUCLEASE ABC SUBUNIT C








ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


j22004587_±2_24 


861 


2781 


333 


|1002 | |436 | 


5.5e-41


Protein name 








Locus Name 


ACC# 



Description 



sp:YADB_ECOLI



P27305:P75662



Hypothetical 34 


§ kd protein in pcnb-dksa intergenic region 




ORF Name 


NT AA 
NTID AAID , — . , . — . , Score 
Length Length 


Probability 


|23597187_t3_61 


862 2782 | 769 J2310 539 


3.0e-100 | 



Protein name 



Description 



Locus Name 



sp:PRIA_RHORU 



ACC# 
P05445 



PRIMOSOMAL PROTEIN N' (REPLICATION FACTOR Y)






ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|23865681_r3_58 863 | 2783 


|739 


2220 |789 | 


|2.2e-78 | 


Protein name 




Locus Name 


ACC# 



sp:SPOT_HAEIN



P43811 



Description 

((PPGPP)ASE) (PENTA-PHOSPHATE GUANOSINE-3'-PYROPHOSPHOHYDROLASE)






ORF Name 


NTID 


AAID 


NT 


AA 

— , Score 
Length


Probability 


|244im7_c2_92 




|2784 


| 84 


255 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


— , , Score 
Length 


Probability 


25I0887_cl_76 


| 865 


| 278b 


1 60 1 


■F a3 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|26594686J:i_i4 


|866 


| p7»i 


215 I 


|e4. | | 


|2.ie-32 


Protein name 








Locus Name 


Acc# 



sp:YICG_ECOLI 



Description 

HYPOTHETICAL 22.0 KD IN RPH-(Mt INTERGENIC REGION PRECURSOR



P31432:P76720



ORF Name 



NTID 



AAID 



29304827 c2 94 



12787 



NT AA 
Length Length 



] [ 



Score Probability 
1286 I 



9.1e-24 



Protein name 



Description 



Locus Name 



sp:Y712_HAEIN



Acc# 
P44836 



PROBABLE TONB-DEPENDENT RECEPTOR HI0712 PRECURSOR




ORF Name 


NTID AAID 


NT AA 
— — Score 
Length Length 


Probability 


29484418_c2_9b 


|868 | 2788 


565 1701 |710 | 


1.6e-118


Protein name 




Locus Name 


ACC# 



methyltransferase



gp:AF060119 



Description 



Pasteurella haemolytica methyltransferase (mod) and restriction endonuclease
(res) genes, complete cds.






ORF Name 



NTID AAID 



30281911 c3 114 



869 



2789 



NT AA 

— — Score 

Length Length 

7T 



Protein name 



Description 



Locus Name 



sp:Y712_HAEIN 



Probability 
9.1e-24

Acc# 
P44836 



PROBABLE TONB-DEPENDENT RECEPTOR HI0712 PRECURSOR






ORF Name 


NTID AAID 


NT AA 
Length Length 


Score 


Probability 


|340170i0_t2_35 


| 870 |2790 


| 307 | |924 | 


|708 | 


|8.3e-70 | 



Protein name 



Locus Name 



hypothetical protein b2431



pir:F65017



ACC# 
F65017 



Description 

ORF Name 
p4407905_cT^T 
Protein name 



NTID 



AAID 



NT AA 
Length Length 



Score Probability 



] [1383 | [1801 | |i.2e-185 



Locus Name 



L-2,4-diaminobutyrate:2-ketoglutarate



gp:AB001599 



ACC# 
AB001599 



Description 



Acinetobacter baumannii DNA for L-2,4-diaminobutyrate:2-ketoglutarate
4-aminotransferase, complete cds.



ORF Name 



NTID 



AAID 



NT 



AA 



Length Length 



Score Probability 



34641550 C3 119 



Protein name 



] [2792 | |185 | [551 | |12<5 | | i.2e-0 7 

Locus Name Acc# 



sp:AIL_YEREN 



P16454 



Description 
ATTACHMENT INVASION LOCUS PROTEIN PRECURSOR 



ORF Name 
p923818_t2_34 



NTID AAID 



27T 



[2TST 



NT AA 
Length Length 
Z7 



— ^, — Score Probability 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 






ORF Name 
|4305455_crjF" 



NTID AAID 



2794 



NT AA 

— — Score 

Length Length 

TIT 



W7T 



TIT 



Probability 
|4.9e-07 



Protein name 



Description 



Locus Name 



sp:FMP1_PSEAE



Acc# 
P17838 



FIMBRIAL PROTEIN PRECURSOR (PILIN) (STRAIN P1)




ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|4329518_t3_56 


| |875 | |2795 


| 212 


i i i b4i i 


4.1e-52


Protein name 






Locus Name 


Acc# 



Description 



sp:KGUA_ECOLI 



P24234 



GUANYLATE KINASE, (GMP KINASE)










ORF Name 




NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


4428413_c1_78


876 2796 


517 


1554 


2128 


2.8e-220



Protein name 



Locus Name 



L-2,4-diaminobutyrate decarboxylase



|gp:AGCL24DD 



Acc# 
D55724 



Description 



Acinetobacter baumannii gene for L-2,4-diaminobutyrate decarboxylase,
complete cds.



ORF Name 



14538558 13 51 



NTID 



W7T 



AAID 



NT AA 

— — Score 

Length Length 

TTT 



] EHZl 



Probability 
1.6e-59 



Protein name 



Locus Name 



hypothetical protein



gp:PPPALl 



Acc# 
X74218 



Description 



Pseudomonas putida ruvB, tolQ, tolR, tolA, tolB and oprL genes.


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score Probability 


45703i8_t3_44 


| |878 


| |2798 | 


|228 


i p' 


| |448 | |3.0e-42 



Protein name 

Description 
PHOSPHOGLYCOLATE PHOSPHATASE, (PGP)



Locus Name 



sp:GPH_HAEIN



Acc# 
P44755 






ORF Name 



14902305 C2 96 



Protein name 



NT ID AAID 



NT AA 
Length Length 

szv 1 fz&nr 



— , , Score 



Probability 
|5.3e-226 



Locus Name 



restriction endonuclease 



gp:AF060119



Acc# 
AF060119 



Description 



Pasteurella haemolytica methyltransferase (mod) and restriction endonuclease
(res) genes, complete cds.



ORF Name 



NT ID AAID 



4954000 13 57 



NT AA 
Length Length 



Score 



95 



] EHZI 



Probability 
1.3e-14 



Protein name 



Description 



Locus Name 



sp:RPOZ_HAEIN 



Acc# 
P43740 



OMEGA CHAIN) (RNA POLYMERASE OMEGA SUBUNIT)


NT 

ORF Name NT ID AAID „ — . , 

Length 


AA 

. — . , Score 
Length 


Probability 


|5079408_13_60 |881 | 2801 | 151 | 


|456 |367 | 


l.le 


1 


Protein name 




Locus Name 




ACC# 


hypothetical protein 1 (vnfA 5' region)


pir:B44514


B44514 


Description 










NT 

ORF Name NT ID AAID . — A , 

Length 


AA 

„ — . , Score 
Length 


Probability 


5080253_c2_89 |882 2802 552 | 


|177S |1142 | 


8.5e-116


Protein name 




Locus Name 




Acc# 



Description 



sp:RECJ_HAEIN



P45112 



SINGLE-STRANDED-DNA-SPECIFIC EXONUCLEASE RECJ,






ORF Name 


NT AA 

NT ID AAID , . , _ — 

Length Length 


Score 


Probability 


|533S7S2_c3_10S 


| 883 2803 325 578 


|1184 | 


3.0e-120



Protein name 



Locus Name 



LytB 



gp:AF027185 



Acc# 
AF027189 



Description 



Acinetobacter sp. BD413 lytB, comB, comC, comE, and comF genes, complete cds;
and unknown genes. 






ORF Name 



NT ID 



AAID 



97582 E3 42 



NT AA 
r — ^ T — Score 
Length Length 



[B70 | 



Probability 
1.3e-25 



Protein name 



Description 



Locus Name 



sp:ICC_ECOLI 



Acc# 
P36650 



ICC PROTEIN 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|13673437_c2_39 


1 I 555 


2805 


| p7 | 


|1791 | 


i 150 1 


p.5e-I3 



Protein name 



Locus Name 



putative terminase



gp:AF147978



Acc# 
AF147978 



Description 



Bacteriophage D3 putative terminase, putative portal protein, putative ClpP
protease, and major head protein genes, complete cds; and unknown genes.



ORF Name 



14181252 c3 43 



NTID 



AAID 



][ 



NT AA 

— — Score 

Length Length 

] r~n [ 



JUT 



Protein name 
Description 
NO-HIT 



Locus Name 



Probability 



ACC# 



ORF Name 
11995312b c2 36 



NTID 



AAID 



][ 



OTT 



NT 
n 
GT5" 



AA 

— Score 
Length Length 



Protein name 
Description 



Locus Name 



Probability 



ACC# 



NO-HIT 




ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


25665952_Cl_33 


| |888 


| 2808 


i 128 i 


| 38 7 | 




88 


0.0036 



Protein name 



Locus Name 



1.7 protein



gp:BPH251805 



Acc# 
AJ251805 



Description 
Bacteriophage phi-YeO3-12 complete genome.






ORF Name 



NTID AAID 



126575311 c2 li V 



7XUT 



NT AA 
Length Length 
TTJ7~ 



] EEEH 



Score Probability 
] |3.8e-10 ~ 



T¥5~ 



Protein name 



Locus Name 



hypothetical protein 



gp:XNE133022



Acc# 
AJ133022 



Description 

Xenorhabdus nematophilus proviral ORF1 to ORFb.



ORF Name 



I 326S7262 c'A 15 



NTID 
J |890 



AAID 



— — Score Probability 



NT AA 
Length Length 
| 150 | |453 | |209 | |S.7e-18 



Protein name 



Locus Name 



DNA primase 



pir:C41830 



Acc# 
C41830 



Description 



ORF Name 



13941887 ci 32 



Protein name 



NTID AAID 



inn - 



NT AA 
Length Length 
TFS 1 



— , Score 



TT5"" 



Probability 
TTUTS 



Locus Name 



|gp:P F A53l^ 



ACC# 
X17490 



Description 

Plasmodium falciparum mRNA for asparagine-rich antigen (clone b3Cb)



ORF Name NTID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


5916253_ci_28 | |892 |2812 


|257 


|774 | 259 | 


3.1e 


-22 | 


Protein name 






Locus Name 




ACC# 








sp:YE22_HAEIN 


P44193 


Description 












HYPOTHETICAL PROTEIN HI1422


ORF Name NTID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


900215_c3_4i | |893 2813 


797 


2394 330 | 


|<T.7e- 


-27 | 


Protein name 






Locus Name 




Acc# 


putative DNA primase 




gp:AF139719




AF139719 



Description 



Klebsiella oxytoca plasmid pACM1 putative DNA primase (pri) gene, complete
cds; and unknown genes. 






ORF Name 


NT ID AAID 


NT 
Length 


— , Score 
Length 


Probability 


fra9'59527 f2 14 


894 2814


294 


885 513 


3.8e-49 


Protein name 






Locus Name 


Acc# 








sp:YBEX_ECOLI 


P77392


Description 










HYPOTHETICAL 33.3 KD PROTEIN IN CUTE-ASNB INTERGENIC REGION




ORF Name 


NT ID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


1178127_t3_20 


| 895 | |2815 


963 


2892 | [2505 | 


|1.9e-281 | 


Protein name 






Locus Name 


Acc# 


SecA 






|gp:AB012226 


AB012226 


Description 


Vibrio alginolyticus gene for SecA, complete cds.




ORF Name 


NT ID AAID 


NT 
Length 


AA 

_ — , . Score 
Length 


Probability 


12298468_ri_i3 


896 2816


102 


|309 82 


0.017 | 


Protein name 






Locus Name 


Acc# 


probable membrane protein L549.12 




|pir:T02800 


T02800


Description 










ORF Name 


NT ID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|12985393_t3_16 


| |897 | |2817 


1 1™ i 


|813 | |537 | 


1.1e-51


Protein name 






Locus Name 


Acc# 








sp:PEPD_HAEIN


P44817


Description 










(PEPTIDASE D) 


ORF Name 


NT ID AAID 


NT 

Length 


AA 

— , Score 
Length 


Probability 


14538262_C2_40 


| 898 | |2818 


69 


|210 | |109 | 


|2.5e-06 


Protein name 






Locus Name 


Acc# 


hypothetical protein APE04b8




pir :A72741 


A72741 



Description 






ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


I1454fi?fifl FT ^4 1 TUTQ 5515 





|735 | 265 


|7.3e 


1 


Protein name 






Locus Name 




Acc# 


hypothetical protein D1022.4 




pir:T34190 




T34190 


Description 












ORF Name NTID AAID 


NT
Length 


AA 

. — . , Score 
Length 


Probability 


|21640900 tl 8 |900 | |2820 


1 442 1 
1 1 


11329 | 11361 1 


|5.3e 


-139 | 


Protein name 






Locus Name 




Acc# 








sp:GSA_PSEAE 




P48247 


Description 












(GLUTAMATE-1-SEMIALDEHYDE AMINOTRANSFERASE) (GSA-AT)






ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


24S0956i_c3_56 |90i | |282i 


i p 41 i 


|726 |147 | 


|2.4e 


-19 


Protein name 






Locus Name 




Acc# 








sp:UP14_ECOLI




P39179:Q46826


Description 










UNKNOWN PROTEIN FROM 2D-PAGE (SPOT PR51)








i 


ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|36329805_rl_2 |902 | |2822 


1 1" i 


|243 60 


0.019 | 


Protein name 






Locus Name 




ACC# 


thyroid hormone sulfotransferase, B2


pir:JC5885 




JC5885 


Description 












ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|41888ii_ti_6 903 2823 


159 


480 1310 1 


1.2e 


-27 


Protein name 






Locus Name 




ACC# 


conserved hypothetical protein 


pir :T03501 


T03501 



Description 






ORF Name 
|4336018_tT7r 



NTID AAID 



UuT" 



2824 



NT AA 
Length Length 
TT5 — 



Score 



ITS" 



3T9~ 



Probability 
9.1e-32 



Protein name 



Description 



Locus Name 



sp:PHNA_ECOLI



ACC# 
P16680 



PHNA PROTEIN 


ORF Name NT ID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|4S89i43_t2_15 905 | (2825 


| 175 


|528 | p2 


9.0e-08 | 


Protein name 




Locus Name 


Acc# 


apolipoprotein N-acyltransferase




gp:AF038595 


AF038595 



Description 



Pseudomonas aeruginosa apolipoprotein N-acyltransferase (cutE) gene, complete
cds.



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|5205312_i3_2i 


906 


2826


i m 


pio 1 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|22382752_cl_ll 


| 907 


2827 


1 117 1 


354 | |100 | 


|2.2e 


-05 


Protein name 










Locus Name 




Acc# 


hypothetical protein








pir:T10511


T10511 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|26251376_cl_8 


908 


2828 


1 P 


249 






Protein name 










Locus Name 




ACC# 



Description 
NO-HIT 






NT 

ORF Name NTID AAID . — . , 

Length 


AA 

— , Score 
Length 


Probability 


J JlOO^IZ XI 1 1-7 \J -7 { £i \j £m J ~J ~f 


1 1758 1 248 


4 . 5e-2T 


Protein name 


Locus Name 


Acc# 


hypothetical protein slr1971


pir:S75639 


S75639


Description 






NT 

ORF Name NTID AAID _ — 

Length 


AA 

. — . , Score 
Length 


Probability 


|35337805_cl_10 | |910 | |2830 | |185 


1 1 I 1 " 1 


|S.9e-12 | 


Protein name 


Locus Name 


Acc# 


sulfate transporter


gp:D89631


D89631 


Description 


Arabidopsis thaliana mRNA for sulfate transporter, complete cds.


NT 

ORF Name NTID AAID — 

Length 


AA 

. — . , Score 
Length 


Probability 


358014i6J:i_i | |91i | |283I | |255 


|768 | 379 


6.1e-35


Protein name 


Locus Name 


Acc# 




sp:RLUA_ECOLI 


P39219


Description 






(PSEUDOURIDYLATE SYNTHASE) (URACIL HYDROLYASE)


NT 

ORF Name NTID AAID — _ 

Length 


AA 

. — . , Score 
Length 


Probability 


|4328403_c2_13 | (912 | |2832 | |109 


|330 | |19i | 


5.1e-15


Protein name 


Locus Name 


Acc# 


BolA protein 


gp:PFL243174


AJ243174 


Description 


Pseudomonas fluorescens partial fumarase C gene, bolA gene and ORF1.


NT 

ORF Name NTID AAID m — . , 

Length 


AA 

T — . i Score 
Length 


Probability 


4572206_ci_9 913 |2833 580 


|1143 391 


1.5e-35


Protein name 


Locus Name 


Acc# 


sulfate transporter


gp:AB008782 


AB008782 


Description 


Arabidopsis thaliana mRNA for sulfate transporter, complete cds.






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|I6990567_ti_7 


| 914 


2834 


61 


185 |109 | 


8.8e-06


Protein name 








Locus Name 




Acc# 


nemV protein 


pir:S54440 


S54440 


Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|I95345ii_c3_4b 


ii 915 


|2835 


70 | 


pi3 | 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


19573425_cl_31 


IP" 


J2836 


150 1 


i 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


20408375_c2JH 


917 


2837 


50 


|183 | 






Protein name 








Locus Name 




ACC# 


Description 














NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Probability 


20930_C3_42 


918 


2838 


144 


435 |87 | 


0.0070



Protein name 

Description 
ATP SYNTHASE PROTEIN 1 



Locus Name 



sp:ATPZ_PSEPU 



ACC# 
P25760 






ORF Name 



NT ID AAID 



20991307 c2 3b 



3TT 



NT AA 
Length Length 
— 



Score 



Probability 
B.Oe-14 



Protein name 



Locus Name 



periplasmic zinc transporter ZnuA 



gp:AF141971



Acc# 
AF141971 



Description 



Haemophilus ducreyi HI0318 homolog gene, partial cds; oxidoreductase homolog
and periplasmic zinc transporter ZnuA (znuA) genes, complete cds; and
ribose-5-phosphate isomerase A homolog gene, partial cds.



ORF Name 



NT ID 



AAID 



22129692 c2 37 



NT 
Length 



AA 



„ — . Score Pro babil ity 
Length u 



1 1 



1.6e-192



Protein name 



Locus Name 



H+-transporting ATP synthase, beta chain



pir:D64071



Acc# 
D64071 



Description 

ORF Name 
2450 7777 c3 43 



NT ID 

]EEI 



AAID 



][ 



NT 
Length 



AA 
Length 
11563 



Score 



[T9T9" 



Protein name 



Description 



Locus Name 



sp:ATPA_ECOLI



Probability 
|1.7e-204 

ACC# 
P00822 



ATP SYNTHASE ALPHA CHAIN, 1 


ORF Name NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|25680i86_cl_28 922 


|2842 


1 ^ 1 


|4«i , |i71 | 


|4.3e-34 | 


Protein name 






Locus Name 


ACC# 



Description 



sp:ATPF_VIBAL



P12989 



ATP SYNTHASE B 


CHAIN, 










ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


34023378_cl_26 


923 


| 2843 


295 | 


|891 | 745 


9.9e-74 | 


Protein name 








Locus Name 


Acc# 



sp:ATP6_ECOLI



Description 
ATP SYNTHASE A CHAIN, (PROTEIN 6) 



P00855:Q47708






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


p J- -7 *± \J \J V— -J J Z? 


RTTZ 


| 2844 | 


55" " 


198 |65 | 


|0. 0045 | 


Protein name 








Locus Name 




ACC# 


extensin homolog


F2401 . 18 






pir:T01456


T01456 


Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|3953587_r3_23 


IP* 


|| 28 45 | 


|174 | 


|525 | |235 | 


8.6e-20


Protein name 








Locus Name 




ACC# 










sp:ZUR_ECOLI




P32692:P76784


Description 












ZINC UPTAKE REGULATION PROTEIN (ZINC UPTAKE REGULATOR)






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|4120425_cl_29 


| 925 


[2846 | 


202 | 


|609 | 258 | 


|3.5e- 


1 



Protein name 



Description 



Locus Name 



sp:ATPD_VIBAL



Acc#
P12987



| ATP SYNTHASE DELTA CHAIN, 


ORF Name NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|433294^_cl_27 |927 


|2847 


84 


255 251 


|1.9e-22 | 


Protein name 






Locus Name 


ACC# 








sp:ATPL_HAEIN


P43721 


Description 










(DICYCLOHEXYLCARBODIIMIDE-BINDING PROTEIN)






ORF Name NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|56333_c3_44 | 928 


|2848 


1 P 09 1 


|930 | 894 | 


|1.6e-89 


Protein name 






Locus Name 


Acc# 



sp:ATPG_ECOLI



Description 
ATP SYNTHASE GAMMA CHAIN,



P00837:P00838






ORF Name 



|70b^>441 c2 38 



NTID 
|929 



AAID 



12849 



NT AA 
Length Length 
— 



3TT 



Score 

EH 



Probability 
1.4e-26



Protein name 



Description 



Locus Name 



sp:ATPE_HAEIN



Acc# 
P43718 



ATP SYNTHASE EPSILON CHAIN,








i 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|992B43i_r2_17 


■ | 930 | 


2850


1 71 1 


I 216 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


3235950_c2_28 


|931 


2851 


| 551 


1686 |2096 | 


|6.8e-217 


Protein name 








Locus Name 


ACC# 



urocanase 



gp:PSEHUTUU



Description 

Pseudomonas putida urocanase (hutU) gene, complete cds.



M33923:M28362



ORF Name NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


3940938_c2_30 932 | |2852 


| 357 


|1074 | 355 


1.4e-33


Protein name 




Locus Name 


Acc# 






sp:HUTG_KLEAE 


P19452 


Description 








(HISTIDINE UTILIZATION PROTEIN G) (FRAGMENT)






ORF Name NTID AAID 


NT 
Length 


AA 
Length 


Probability 


|3953181_c3_34 | 933 | |2853 


i i 


|1305 | |1007| 


|1.7e-101 


Protein name 




Locus Name 


Acc# 



gp:YP102KB



AL031866 



Description 

Yersinia pestis 102 kbases unstable region: from 1 to 119443






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|4B2218iJt3JL4 


934 


2854 


78 


Q37 1 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT 


ORF Name 


NTID 


AAID 


"NTT 

IN J. 

Length 


AA 

_ — . , Score 
Length 


Probability 


789037_c3_33 


||935 


| 2855 | 


525 | 


|1578 | [I486 | 


3.0e-152


Protein name 










Locus Name 




Acc# 


histidine ammonia-lyase, :histidase






pir:A35251 


A35251:S39381


Description 
















NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


828942_tl_l 


936 


2856 


m | 


|89I | |180 | 


7.0e-12


Protein name 










Locus Name 




ACC# 












sp:YYAM_BACSU 


P37511 


Description 
















HYPOTHETICAL 32.9 KD PROTEIN IN TETB-EXOA INTERGENIC REGION




1 


ORF Name 


NTID 


AAID 


NT 

Length 


AA 

. — . , Score 
Length 


Probability 


10978400_c3_163 


| 937 


|2857 


253 | 


|762 | 186 | 


|1.7e 


1 


Protein name 










Locus Name 




Acc# 












sp:HEM4_PSEAE 


P48246 


Description 
















) 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— . , Score 
Length — ■■- 


Probability 


1259626_c2_127 


938 


2858 


614 | 


1845 |1756 | 


7.3e-181


Protein name 










Locus Name 




Acc# 












sp:YA51_HAEIN 





Description 

HYPOTHETICAL ABC TRANSPORTER ATP-BINDING PROTEIN HI1051



Q57180:O05043






ORF Name 



NT ID AAID 



12672305 C2 140 



NT AA 
Length Length 

rnu — 



Score 



TIT 



Probability 
4.8e-10



Protein name 



Description 



Locus Name 



sp:YGGX_HAEIN



Acc# 
P44048 



HYPOTHETICAL PROTEIN HI0760


ORF Name NTID AAID 


NT 
Length 


AA 

T — ■ i Score 
Length 


Probability 


13I3I990_t3_71 940 | |2860 


||U) 


|blO | |14V | 


|2.7e-10 | 


Protein name 




Locus Name 


Acc# 



sp:DSBC_ERWCH



P39691 



Description 

THIOL:DISULFIDE INTERCHANGE PROTEIN DSBC PRECURSOR



ORF Name 



NTID AAID 



114847290 El 89 



m 1 [2861 | 



NT AA 
Length Length 
51 



— . , Score 



186 



Protein name 
Description 



Locus Name 



Probability 



Acc# 



NO-HIT 



ORF Name 



NTID AAID 



15875832 t3 92 



942 [ [2862 | [ 



NT AA 

— — Score 

Length Length 

T5 1 \27T 



|228 | 



Protein name 
Description 



Locus Name 



Probability 



Acc# 



NO-HIT



ORF Name 
|16453S9i_c3_16B 



NTID AAID 



— , — , Score Probability 



"9TT 



NT AA 
Length Length 
ITS 



F 7 I 



Protein name 
Description 



Locus Name 



ACC# 



NO-HIT






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|I66i0052_c3_i83 


944 


2864


93 


282 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|1957701_rl_30 




| 2865 


|277 | 


|834 | 329 | 


|1.2e-29 | 


Protein name 








Locus Name 


Acc# 



Description 



sp:GRPE_HAEIN



P43732 



GRPE PROTEIN 1 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


19720930J:3J74 


| 946 


| |2866 


276 | 


831 597 


4.8e-58 | 


Protein name 








Locus Name 


Acc# 



sp:DAPB_ECOLI 



P04036 



Description 

DIHYDRODIPICOLINATE REDUCTASE,



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|21489390_t2_59 


1 l y47 


| pw 


i r i 


p04 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|21689063_tl_10 


948 


2868 


414 


1245 328 | 


1.5e-29


Protein name 








Locus Name 


Acc# 



gp:AF176824



AF176824 



Description 



Synechococcus PCC7942 plasmid pANL O-acetylserine (thiol)-lyase SrpD (srpD),
gamma-glutamyltranspeptidase SrpE (srpE), alpha-helical coiled-coil protein
SrpF (srpF), SrpJ (srpJ), ATP-binding protein of ABC transporter SrpK (srpK),
membrane lipoprotein SrpL (srpL), and cytoplasmic membrane protein SrpM (srpM)
genes, complete cds.






ORF Name 



NT ID AAID 



2204217V T2 6b 



NT AA 
Length Length 
TTS 1 11482 



Score Probability 
11203 I |2.9e-122 



Protein name 



Locus Name 



argininosuccinate lyase argH 



pir:C69589



Acc# 
C69589 



Description 

ORF Name 
122070191 tl 5 



NT ID 

]EfH 



AAID 



AA 

— Score 
Length Length 



][ 



NT 



Probability 



] i 



1230 



p5~ | | 2.8e-46 



Protein name 



Locus Name 



cystathionine-gamma-lyase



gp:AF180145 



Acc# 
AF180145 



Description 



Zymomonas mobilis GTP-binding protein CgpA (cgpA), 60KD inner-membrane
protein yidC (yidC), hypothetical protein, glutamine-pyruvate aminotransferase
gltB (gltB), glutamate synthase small subunit gltS (gltS), undecaprenol kinase
udk (udk), hypothetical protein, NADH dehydrogenase, hypothetical
protein; zm12orf5, hypothetical protein, aspartate aminotransferase
A, beta-hydroxysteroid dehydrogenase, phosphomannomutase pmm



ORF Name 



122922082 c2 138 



Protein name 
Description 



NTID 



AAID 



NT AA 
Length Length 
JT5 1 ISSTT 



Score Probability 



Locus Name 



ACC# 



NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


23470i_c3_187 


952 


2872 


73 


|222 | |115 | 


5.1e-07


Protein name 








Locus Name 




Acc# 


extensin 


pir:S22697




Description 












S22697:S21006


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

r — , i Score 
Length 


Probability 


23572127_ti_3i 


953 


2873 


1 "6 | 


1911 |2307| 


3.0e-239



Protein name 

Description 
DNAK PROTEIN (HEAT SHOCK PROTEIN 70) (HSP70) 



Locus Name 



sp:DNAK_FRATU 



Acc# 
P48205 






ORF Name 



NTID AAID 



I2390968B t3 77 



NT AA 

— , ^ — Score 

Length Length 

i 



Probability 
|4.2e-43 



Protein name 



Locus Name 



rubredoxin--NAD+ reductase, :hypothetical
protein hydA 3'-region



pir:C65051



Acc# 
C65051 



Description 

ORF Name 
124073762 c3 ib8 



NTID 



AAID 



][ 



NT AA 
Length Length 



Score 



Probability 
|S.0e-S7 



Protein name 



Locus Name 



AvtA 



Description 



gp:AF014804



Acc# 
AF014804 



Neisseria meningitidis PglB (pglB), PglC (pglC), PglD (pglD), and AvtA (avtA)
genes, complete cds.



ORF Name 
|24100455_t2_45 



NTID 



AAID 



][ 



NT AA 

— — Score 

Length Length 

J7T 



Probability 
1.3e-S4 



Protein name 



Locus Name 



intrinsic membrane protein 



gp:AB000100



Acc# 
AB000100 



Description 



Synechococcus sp. DNA for intrinsic membrane protein, malK-like protein,
cyanase, complete cds.


ORF Name 


NT AA 
NTID AAID , — . , . — . , Score 
Length Length 


Probability 


24391877_cl_109 


957 |2877 | 329 990 927 


b.le-93 


Protein name 


Locus Name 


Acc# 




sp:HEM3_ECOLI


P06983:P78125


Description


SYNTHASE) (HMBS) (PRE-UROPORPHYRINOGEN SYNTHASE)




ORF Name 


NT AA 
NTID AAID . — . , . — . , Score 
Length Length 


Probability 


24417807_t3_75 


958 | 2878 185 558 253 


|1.2e-22 | 



Protein name 



Locus Name 



UIpT 



gp:S71704 



Acc#
S71704



Description 






ORF Name 



12461900^ h'l 79 



Protein name 



NTID AAID 



2879 



NT 
237 



AA 

— Score 
Length Length 



Probability 
5.2e-61 



Locus Name 



sp:NRTC_SYNY3



Acc# 
P73450 



Description 
NITRATE TRANSPORT ATP-BINDING PROTEIN NRTC



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


260532b0_cl_112 




|2880 


268 | 


|807 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


26053250_c2_134 


IP 61 


| |2881 


116 | 


351 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


26362680_c2_129 


962 


| 2882 


296 


|591 | |480 | 


|1.2e-45 | 


Protein name 








Locus Name 


Acc# 



Description 



sp:YJFH_HAEIN



P44906 



HYPOTHETICAL TRNA/RRNA METHYLTRANSFERASE HI0860,




ORF Name 


NT AA 
NTID AAID . — . , . — . , Score 
Length Length


Probability 


2928382J:2_40 


|963 |2883 323 |972 | |895 | 


1.3e-89 | 



Protein name 



Locus Name 



sodium-dependent transporter homolog yocS



pir:E69902



Acc# 
E69902 



Description 






ORF Name 



NTID AAID 



NT AA 
— — Score 
Length Length 



29339432 c3 166 



1964 



2884 



322 



F^n f i [ 



Probability 
|4.2e-05 



Protein name 



Locus Name 



hypothetical protein b2755 



pir:G65056



Acc# 
G65056 



Description 

ORF Name 
|29952837_c3_181 
Protein name 

Description 
DNAJ PROTEIN 



NTID 



AAID 



NT AA 
— — Score 
Length Length — 



Probability 



12885 



] | 857 | | 117 | |9-9e-13 



Locus Name 



sp:DNAJ_SYNP7 



Acc# 
P50026 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — , , Score 
Length 


Probability 


|32S29186_c2_139 


|9££ 


| |288£ 


i i 


672 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|3307_t2_47 


957 


| |2887 


i FS 


|360 | 349 


9.1e-32 | 



Protein name 

Description 
HYPOTHETICAL PROTEIN HI1723



Locus Name 



sp:YADR_HAEIN 



Acc# 
P45344 



ORF Name 



34492161 c3 168 



NTID 



AAID 



NT AA 
Length Length 
|353 



Score Probability 



TUZT 



Protein name 
Description 
NO-HIT 



Locus Name 



Acc# 






ORF Name 



13940925 c3 182 



Protein name 



NTID AAID 



1969 



2889 



NT 
n 
ITT 



AA 

— , Score 
Length Length 



Probability 
3.7e-44 



Locus Name 



sp:YHET_ECOLI 



ACC# 
P45524 



Description 

HYPOTHETICAL 38.5 KD PROTEIN IN KITO-PRKB INTERGENIC REGION



ORF Name 


NTID 


AAID 


NT 
Length 


, — . , Score 
Length 


Probability 


409S443_c3_170 


| |970 


| |2890 


1 I 161 


|48S | |147 | 


2.3e-10 | 


Protein name 








Locus Name 


Acc# 


hypothetical protein Rv0163 




pir:G70903 


G70903 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|4187538_r2_50 


1 F 1 


| |289i 


473 


J1422 | 1141 


|i.ie-ii5 | 


Protein name 








Locus Name 


Acc# 










sp:MPL_HAEIN 


P43948 


Description 












LIGASE,


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

- — . , Score 
Length 


Probability 


|4328135_tl_13 


| |972 


| |2892 


i 474 i 


(1425 |812 | 


|7.9e-81 | 



Protein name 



Locus Name 



periplasmic substrate binding protein 



Description 



gp:AF001333



Acc# 
AF001333 



Synechococcus PCC7942 periplasmic substrate binding protein (cynA), integral
membrane protein (cynB) and ATP-binding protein (cynD) genes, complete cds.



ORF Name 



NTID AAID 



14328955 13 99 



\T7T 



NT AA 
Length Length 
TT2 1 



— . , Score 



T9S~ 



Probability 
1.2e-24 



Protein name 



Locus Name 



sp:Y117_HAEDU 



Acc# 
O30825



Description 

I HYPOTHETICAL PROTEIN HYPO! 17 






ORF Name 



NT ID AAID 



4423318 c3 160 



AA 

— , Score 
Length Length 

12307 



NT 
n 



TF8T" 



Probability 
|3.0e-239 



Protein name 



Locus Name 



93% identity over 631 amino acids with E. coli



gp:STYSTMF1



Acc# 
AF170176 



Description 
Salmonella typhimurium fragment STMF1.



ORF Name 


NT ID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


4454512 fl~32 


1 975 I 12895 


— 1 ill" 1 

II 111 1 


|336 | |ii3 


9.3e-07


Protein name 








Locus Name 




Acc# 










sp:Y173_HAEIN 


P43960 


Description 














HYPOTHETICAL PROTEIN HI0173












ORF Name 


NT ID AAID 


NT 
Length 


AA 

_ — . , Score . 
Length 


Probability 


454S1fifl fl — 


1 T7Z 1 12895 


421 


1256 1342 


5.4e-137


Protein name 








Locus Name 




Acc# 










sp:DADA_ECOLI 


P29011 


Description 














D-AMINO ACID DEHYDROGENASE SMALL SUBUNIT,










ORF Name 


NT ID AAID 


NT 
Length 


AA 

- — . , Score 
Length - 


Probability 


5119127_c2_153 


| |977 | 2897 


1 1 


1956 | |1464 


6.4e-150


Protein name 








Locus Name 




Acc# 










sp:YHES_ECOLI 




P45535 


Description 














HYPOTHETICAL ABC TRANSPORTER ATP-BINDING PROTEIN YHES






ORF Name 


NT ID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|6727085_rl_17 


| |978 | |2898 


563 


|1692 |879 | 


6.3e-88



Protein name 



Locus Name 



putative gamma-glutamylcysteine synthetase



gp:PSP243941 



ACC# 
AJ243941 



Description 

Pseudomonas sp. strain HR199 partial vanB, tdh, gcs, ehyA and ehyB genes.






ORF Name 



1673253 t3 72 



Protein name 

Description 
DNAJ PROTEIN



NT ID AAID 



NT AA 

— ^_ _ — __ Score Probability 
Length Length 



2899 



|3.2e-116 



Locus Name 



sp:DNAJ_SALTY



Acc# 
Q60004 



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


S854S77_c3_167 


| |980 


| 2900 


1 4 " 1 


|1389 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


8014!>2_iy_60 


n si)i 


| 2901 


1 1 


p 31 i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


884712_ci_113 


982 


| |2902 


1191 


|3576 | |79 | 


0.0031



Protein name 



Locus Name 



hypothetical protein PH1246 



pir:A71069



ACC# 
A71069 



Description 



ORF Name 

|91S633_clTIB" 

Protein name 
Description 

NO-HIT 



NTID 



AAID 



NT AA 
— — Score 
Length Length 



] EO 



Locus Name 



Probability 



Acc# 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


[9756888 tl 18 


984 


|2904 


182 


l b4y 1 


229 


4.8e-19


Protein name 










Locus 


Name 




ACC# 












sp:YA21_PSEAE 


P21482 


Description 


















HYPOTHETICAL 17.8 KD PROTEIN IN ALGR2 5' REGION








i 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|10329680_tl_3 


[985 


| |290£ 


1 1 


\1{>'±S | 


|587 | 


p~3e 


-57 j 


Protein name 










Locus 


Name 




Acc# 



gp:PSHOPRC



D28119 



Description 

Pseudomonas aeruginosa oprC gene for outer membrane protein C, complete cds.



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

„ — , , Score 
Length 


Probability 


|10S26550_c3_128 


985 


p^u£ | 




i 158 i 




Protein name 








Locus Name 


Acc# . 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


Ii017010_c3_139 


| 987 


2907 | 


247 | 


|744 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|12894752_tl_l!} 


1 P y 


|2908 | 


i" i 


|210 | |84 | 


0.016


Protein name 








Locus Name 


ACC# 


conserved hypothetical protein




pir:A72221


A72221 



Description 






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 




| 989 


2909 


88 


1257 1 

r 1 




Protein name 
Description 








Locus Name 


Acc# 


NO-HIT 










1 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— . , Score 
Length 


Probability 


14533433_t3_Sb 


| |990 


| |2910 


1 P a 1 


p« i 




Protein name 
Description 








Locus Name 


Acc# 




NO-HIT




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


16023402_c3_i2!> 




| |29ii 


j 887 


|26£4 | |14i0 | 


|6.3e-160 | 


Protein name 








Locus Name 


Acc# 










sp:FTSK_COXBU 


P39920 


Description 












CELL DIVISION PROTEIN FTSK HOMOLOG 








ORF Name 


NTID 


AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


|19537930_ti_i 


| |992 


2912 


97 


|294 | 116 


|4.5e-07 | 


Protein name 








Locus Name 


Acc# 


hypothetical protein APE0900




pir:D72685


D72685 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


19Ui2b00_tl_i4 


ip>» 


| |2913 


1 F* 1 


|2925 |2125| 


5.8e-220 | 


Protein name 








Locus Name 


Acc# 



Description 



sp:DPO1_HAEIN



P43741 



DNA POLYMERASE I, (POL I)






ORF Name 


NT ID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|2008433__cl_102 


| |994 2914 


| 287 | 


864 




Protein name 






Locus Name 


Acc# 


Description 












ORF Name 


NT ID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|20505502_c2_10b 


| |995 | (2915 


i p i 


|189 | 




Protein name 






Locus Name 


ACC# 


Description 










NO-HIT


ORF Name 


NT ID AAID 


"NTT 
IN X 

Length 


, — . , Score 
Length 


Probability 


|20S8i377_c3_i27 


| 996 2916 


883 


2667 |2717| 


1.1e-282


Protein name 






Locus Name 


ACC# 


DNA topoisomerase, 






pir:G64119


G64119 


Description 










ORF Name 


NT ID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|213i550_ii_9 


| 997 2917 




|348 1 153 


5.4e-11


Protein name 






Locus Name 


Acc# 


pterin-4-alpha-carbinolamine




pir:S74881


S74881 


dehydratase : protein 


ssl2296 rprotein 


SS12296 








Description 










ORF Name 


NT ID AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


|22119402_r3_77 


| |998 | |2918 


i p" i 


|789 | |49i | 


|8.2e-47 | 


Protein name 






Locus Name 


ACC# 








sp:OCCM_AGRTl 


P35115 


Description 










OCTOPINE TRANSPORT SYSTEM PERMEASE PROTEIN OCCM


i 






ORF Name 



NT ID AAID 



254408 c2 111 



NT AA 
Length Length 
TTZ 



Score 



T73~ 



Probability 
S.3e-24 



Protein name 

Description 

METHYLTRANSFERASE)



Locus Name 



sp:RRMA_ECOLI 



Acc# 
P36999 



ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability


|23928i30_t3_66 


1000 | 2920 


1 " 1 


I 192 1 




Protein name 






Locus Name 


Acc# 


Description 










NO-HIT


ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


24256553_t2_3I 


1001 2921 


| 751 


|22B6 | |2080| 


|3.4e-215 | 


Protein name 






Locus Name 


Acc# 


DNA topoisomerase 


IV 




gp:AB023570 


AB023570 


Description 


Vibrio parahemolyticus pare gene for DNA topoisomerase IV, complete cds.


ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|24645;7&3_ci_91 


|1002 | 2922 


1 l m 1 


|891 | |255 


|8.4e-22 | 


Protein name 






Locus Name 


Acc# 


hypothetical protein jhpllSS




pir:G71841


G71841 


Description 










ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|24881313_r2_32 


1003 1 2923 


i p° i 


p" i 




Protein name 






Locus Name 


Acc# 



Description 
NO-HIT 






ORF Name 



NTID AAID 



25417950 c3 134 



TOW 



NT 
Length 
TF2 



AA 

Length Score Probability 



7TT 



1.8e-17



Protein name 



Description 



Locus Name 



sp:RRMA_ECOLI 



Acc# 
P36999 



METHYLTRANSFERASE) 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|31406308_c2_113 


|1005 


| 2925 


| 114 


pu | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


314B5625J:3J72 


|1006 


| |2925 


ii" 5 


|828 | 427 


|5.0e-40 


Protein name 








Locus Name 


Acc# 



Description 



gp:AB032934 



AB032934 



Vibrio alginolyticus ptsA, orfC, orfD genes for PF60 and hypothetical
proteins, complete cds.



ORF Name 



134025462 Cl 104 



NTID AAID , 
11007 



NT 
Length 
S3 



AA 
Length 



Score Probability 



Protein name 
Description 

NO-HIT



Locus Name 



ACC# 



ORF Name 



NTID AAID 



134085012 ci 97 



11008 



292S 



NT 
Length 
TTS 



AA 

Length 



Probability 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT



zi 






ORF Name 



NTID AAID 



35360075 ti 2 



TuW 



2929 



NT AA 
Length Length 
1 



Score Probability 
] |0.(m — 



I7T 



Protein name 



Locus Name 



nef protein



gp:AF169778 



Acc# 
AF169778 



Description 



HIV-1 isolate G221 from India nef protein (nef) gene, partial cds; and 3'
long terminal repeat, partial sequence.


NT AA 

ORF Name NTID AAID _ — . , . — 

Length Length 


Score 


Probability 


361i0253_t2_36 | |10I0 | |2930 | 108 | |327 | 


r« i 


0.0077



Protein name 



Locus Name 



outer surface protein A



gp:BBPWUDII 



Acc# 
X68539 



Description 

B.burgdorferi (PWudII) plasmid OspA gene for outer surface protein A.



ORF Name 



3945893 F3 71 



NTID AAID 



NT AA 
„ — L1 , — ^, Score Probability 
Length Length 



Protein name 



Description 



] |285 | |861 | |3.3e-82 

Locus Name Acc# 



gp:AB032934



AB032934 



Vibrio alginolyticus ptsA, orfC, orfD genes for PF60 and hypothetical
proteins, complete cds.


ORF Name NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


4337963_t2_5i 1012 


2932 


2S8 


807 434 


9.0e-41 


Protein name 






Locus Name 


Acc# 








gp:AB032934 


AB032934 


Description 










Vibrio alginolyticus ptsA, orfC, orfD genes for PF60 and hypothetical
proteins, complete cds.


ORF Name NTID 


AAID 


NT 
Length 


AA 
Length 


Probability 


4691525_J:i_7 |1013 


2933 


i ■» i 


pi | 





Protein name 
Description 
NO-HIT



Locus Name 



Acc# 






ORF Name 


NT ID AAID 


NT 
Length 


AA 

_ — , , Score 
Length 


Probability 


1 / J J J J J -L -L W 


11714 2934 


1570 1 
1 1 


[1713 | |1749 | 


4.0e-180


Protein name 








Locus Name 




Acc# 










sp:RF3_HAEIN


P43928 


Description 














PEPTIDE CHAIN RELEASE FACTOR 3 (RF-3)










ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|5102193_r2_b4 


1015 | |2935 


| 372 | 


|1119 | |897 | 


7.8e-90


Protein name 








Locus Name 




ACC# 



sp:GLMU_HAEIN 



P43889 



Description 

ACETYLGLUCOSAMINE-1-PHOSPHATE URIDYLTRANSFERASE)



ORF Name 



5111013 c3 126 



NTID 
] [1016 



AAID 



NT 



AA 



Length Length 



Score Probability 



|?336 | 



pur 



] 



Protein name 



Description 



Locus Name 



ACC# 



NO-HIT


ORF Name NTID AAID 


NT 
Length 


AA 
Length 


Probability 


|651&518_ri_25 | |1017 | |2937 


1 P 4i 1 


|732 | |499 | 


|1.2e-47 | 


Protein name 






Locus Name 


ACC# 








sp:NOCQ_AGRT5 


| P35118 


Description 










NOPALINE TRANSPORT SYSTEM PERMEASE PROTEIN NOCQ






ORF Name NTID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


5820875_cl_100 1018 | |2938 


1 P 4b 1 


|1038 | 159 


|8.8e-09 | 


Protein name 






Locus Name 


ACC# 


apolipoprotein A-IV precursor




pir:C40892


C40892 



Description 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


807592 c3 137 




1 2939 " 
| 


553 | 


1692 171 


1.2e-08


Protein name 








Locus Name 




Acc# 


Trip230








gp:AF007217 




AF007217 


Description 














Homo sapiens Trip230 mRNA, complete cds.








ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


|978502_£2_52 


| |I020 


| 2 940 


|256 


1801 1 1430 1 


2.4e-40


Protein name 








Locus Name 




ACC# 










gp:AB032934 




AB032934 


Description 














Vibrio alginolyticus ptsA, orfC, orfD genes for PF60 and hypothetical
proteins, complete cds.




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


|1005S500_ti_3 


|102i 


| 2941 


78 J 


|237 | |163 | 


4.7e-12


Protein name 








Locus Name 




Acc# 


hypothetical protein HI0187




pir:B64145




B64145 


Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


1055S455_c3_200 


|1022 


| 2942 


453 1 


|1392 | |831 | 


7.7e-83


Protein name 








Locus Name 




ACC# 










sp:YWBN_BACSU




P39597 


Description 














HYPOTHETICAL 45.7 KD PROTEIN IN EPR-GALK INTERGENIC REGION PRECURSOR


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


12503450_ti_20 


| |1023 


| 2943 


558 


|1S77 | 1252 


1.6e-128


Protein name 








Locus Name 




Acc# 










sp:PILB_PSEAE 




P22608 


Description 














FIMBRIAL ASSEMBLY PROTEIN PILB














ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


fi"2978955 f 2 "43" 


[1024 


| 2944 


| 274 


P 5 1 


609 


2 .6e 


-b9 


Protein name 








Locus 


Name 




Acc# 










sp:YH25_AZOCH 




P54085 


Description 
















(ORF5)


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|13679786_t3_112 


| |1025 


| 2945 


1 iiU 1 


P » | 


i 155 1 


|1.2e 


-05 



Protein name 



Locus Name 



Hypothetical protein 



|gp:BSZ75208 



Acc# 
Z75208 



Description 



B.subtilis genomic sequence 89009bp.










ORF Name 




NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


13712643_c3_ 


_i87 


1026 2946 


i 2i! i 


643 


113 


0.00040 | 



Protein name 



Locus Name 



conserved hypothetical protein 



|pir:B75483 



Acc# 
B75483 



Description 



ORF Name 



114191915 t2 69 



NTID 
] [1027 



AAID 



] [2947 | 



NT 
Length 
|365 



AA 

_ — Score 
Length 



] [1098 | 



Probability 
J |2.5e-34 ~ 



Protein name 



Locus Name 



conserved hypothetical protein ylt>K 



pir:H69874



ACC# 
H69874 



Description 

ORF Name 
| 1425i568J:2_ 7 2 

Protein name 



NTID 



AAID 



— . , _ — . , Score 



TU7W 



NT 
Length 



AA 
Length 



|207 | 



Locus Name 



Probability 
1.0e-16

Acc# 



hypothetical protein APE1486 



pir:F72628



F72628 



Description 






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|14492180_c2_149 




| 2949 


r 


P 04 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


14511561_c3_197 


|1030 


| 2950 


pi9 | 


|3960 | |3473 | 


|0.0 | 



Protein name 



Locus Name 



phosphoribosylformylglycinamidine
synthase, :formylglycinamide ribonucleotide
synthetase:phosphoribosylformylglycinamidine



pir:SYECP<5 



Acc# 

D65033:A31862:A34192



Description 



ORF Name 



14878927 t2 42 



NTID 

]EIE 



AAID 



][ 



NT 
Length 
T7T 1 



AA 

— , Score Probability 
Length 



Protein name 



Description 



|815 | |294 | |4.8e-39 
Locus Name 



sp:HI52_AQUAE 



Acc# 
O67780



ORF Name NTID AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


14882750_c2_164 | |1032 | |2952 | 


|409 | 


1230 585 | 


|9.0e 


1 


Protein name 






Locus Name 




ACC# 


putative membrane transport protein. 




gp:SCC75A 


AL133220 


Description 










Streptomyces coelicolor cosmid C75A. 




ORF Name NTID AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


14970637_c3_i76 1033 2953 


237 | 


|714 | 791 | 


1.3e-78


Protein name 






Locus Name 




Acc# 



Description 
PROTEIN F21.5) 



sp:CLPP_ECOLI



P19245 






ORF Name 



NT ID 



AAID 



15831636 c2 158 



TUTT 



NT AA 
Length Length 
TT7 



Score 



Protein name 



Locus Name 



Acriflavin resistance protein D.



gp:D90846 



Description 



Probability 
1.5e-19 



Acc# 

D90846:AB001340



E.coli genomic DNA, Kohara clone #357(46.5-46.8 min.).






ORF Name 


NT ID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|165842_c2_157 | 


1035 | 2955 


1 I 205 


i i 419 


| 332 | 


5.8e-30 



Protein name 



Description 



Locus Name 



sp:NOLH_RHIME



Acc# 
P25198 



NODULATION PROTEIN NOLH PRECURSOR










ORF Name 


NT ID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


16687526_ti_ 


10 1036 


2956 


448 


1347 


|1082 | 


|1.9e-109 



Protein name 



Description 



Locus Name 



sp:ARGA_ECOLI



Acc# 

P08205:O68009:O68010:O68011:O6



SYNTHASE) (AGS)


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


191675_c2_150 


| 1037 


| 2957 


1 aw | 


p56 | 


698 


9.5e-S9 | 



Protein name 



Locus Name 



5' adenylylsulfate APS reductase



gp:AF170343



Acc# 
AF170343 



Description 



Burkholderia cepacia 5' adenylylsulfate APS reductase (cysH) gene, complete
cds; and ATP sulfurylase small subunit (cysD) gene, partial cds.



ORF Name 
|19821942_c3_205 

Protein name 

Description 

NO-HIT 



NTID 



AAID 



NT AA 
— — Score 
Length Length 



64 



] D 



Locus Name 



Probability 



ACC# 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


20203302 c3 18'5™"" " 


1039 


2959 


226 


681 |141 


4.0e-08


Protein name 








Locus Name 




Acc# 


DnrE protein 








gp:PST131716




AJ131716 


Description 














Pseudomonas stutzeri dnrE gene and ORF235 (partial).








ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|20570385_c2_159 


| |1040 


| |2960 


i r 


| 552 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT














ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


21494010_c2_174 


|1041 


| |2961 


| 455 


| 1368 |1377 


1.1e-140


Protein name 








Locus Name 




ACC# 


nitric oxide reductase 






|gp:AF002217 




AF002217 


Description 














Ralstonia eutropha megaplasmid pHG1 nitric oxide reductase (norB) gene, complete cds.




ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


21676712_cl_120 


1042 


| 2962 


| 130 


| |393 | 395 


|1.2e- 


1 


Protein name 








Locus Name 




Acc# 


sulfate adenylyltransferase subunit CysN


gp:AF130466 




AF130466 



Description 



Campylobacter jejuni peptide chain release factor 2 (prfB) gene, partial cds;
alpha-2,3-sialyltransferase (cst-I) and sulfate adenylyltransferase subunit
CysD (cysD) genes, complete cds; and sulfate adenylyltransferase subunit CysN
(cysN) gene, partial cds.






ORF Name 



122078812 c2 154 



Protein name 



Description 



NT ID 



AAID 



TUZT 



][ 



NT AA 
Length Length 
12184 



727 



Score 



Locus Name 



Probability 

|3.8e-158 

Acc# 



sp:RECG_ECOLI 



P24230:P76721



ATP-DEPENDENT DNA HELICASE RECG,


ORF Name 


NT ID AAID 


NT 
Length 


*w _ 
. — . , Score 
Length 


Probability 


22350925_ci_115 


| 1044 | 2954 


724 


12175 1 12193 1 
1 III 


|3.5e-227 
1 


Protein name 






Locus Name 


Acc# 








sp : r Avjo xriDiirK 


DO Q7Q"} 


Description 










ORF Name 


NT ID AAID 


NT 
Length 


AA 

t — , i Score 
Length 


Probability 


|23555252_c3_203 


| |1045 | |2955 


500 


1803 |709 | 


[6 . Oe-85 


Protein name 






Locus Name 


Acc# 


glutamate synthase (ferredoxin) homolog yerD


pir:C69794


C69794 


Description 










ORF Name 


NT ID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|23593830_c3_199 


1045 | 2955 


|340 


|1023 | |452 | 


|9.7e-44 | 


Protein name 






Locus Name 


Acc# 



Description 



sp:YWBM_BACSU 



P39596 



HYPOTHETICAL 42.8 KD PROTEIN IN EPR-GALK INTERGENIC REGION




ORF Name 


NT ID 


NT 

AAID — _ 
Length 


AA 
Length 


Probability 


|23551900_t2_48 


1047 


2957 328 | 


|987 | [534 | 


|5.8e-52 


Protein name 






Locus Name 


Acc# 



sp:YOHI_HAEIN 



P44606 



Description 
HYPOTHETICAL PROTEIN HI0270






ORF Name 
|23fl62S76_ci_124 
Protein name 



NT ID AAID 



NT AA 
Length Length 
T53 



probable antibiotic resistance protein mtrc 



Description 



Score Probability 
[TFT 



4.5e-14



Locus Name 



pir:S42418



Acc# 

S42418:S40252



ORF Name 



NT ID AAID 



NT AA Score 
Length Length 



Probability 



|24042500_c3_183 


1049 | |29S9 


| 313 


|942 | |608 | 


|3.3e-59 | 


Protein name 








Locus Name 


Acc# 










sp:CYSN_MYCTU


Q10600 


Description 












SULFURYLASE) 




ORF Name 


NT ID AAID 


NT 
Length 


AA 

T — . . Score 
Length 


Probability 


|243022S0_c3_i79 


| |1050 2970 


i r i 


|1233 1331 


|8.0e-136 | 


Protein name 








Locus Name 


Acc# 


3-oxoacyl-CoA thiolase




gp:AF150672 


AF150672


Description 












Pseudomonas putida 3-oxoacyl-CoA thiolase (fadA) gene, complete cds.


ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|25584438_c3_189 


| 1051 | |2971 


1 l i4J 1 


h 32 1 1 174 1 


|7.6e-12 | 


Protein name 








Locus Name 


Acc# 


| ceoB 






j 


gp:BOT97042 


| U97042 


Description 












Burkholderia cepacia CeoA (ceoA) and CeoB (ceoB) genes, complete cds.




ORF Name 


NTID AAID 


NT 
Length 


AA 

t — , i Score 
Length 


Probability 


25984558_c3_188 


1052 | |2972 


134 


405 133 


1.8e-07


Protein name 








Locus Name 


ACC# 


acriflavin resistance protein D (acrD) RP170


pir:F71727


F71727 



Description 






ORF Name 
|26741SB6_cl_13S 



NT ID AAID 



297T 



NT AA 
Length Length 




5TT 



Score 



Probability 
0.0011 



Protein name 



Description 



Locus Name 



sp:HA34_BRELC



ACC# 
Q99074 



HAM34 PROTEIN 


ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


3245540_c3_190 |1054 | |2974 | 


425 


|1278 | |265 | 


|9.4e-20 | 


Protein name 




Locus Name 


Acc# 


probable cation efflux system protein


pir:E71874 


E71874 


Description 








ORF Name NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


3375052_ci_i25 | |1055 | |2975 | 


|144 | 


|435 |177 | 


3.6e-12


Protein name 




Locus Name 


ACC# 


probable efflux transporter




pir:H71918


H71918 


Description 








ORF Name NTID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


34027092_c2_175 | |10B5 | [2375 | 


,4 | 


222 p | 


|0.025 


Protein name 




Locus Name 


Acc# 


tonoplast intrinsic protein 


gp:AF037061


AF037061 


Description 


Zea mays tonoplast intrinsic protein (ZmTIP1) mRNA, complete cds.



ORF Name 
p5i97125_t3_79 



NTID 



AAID 



NT AA 

t — 4-u t — *-u Score 
Length Length 



] [1057 | [2977 | |179 | |540 | |155 | 



Probability 
3.3e-11



Protein name 



Locus Name 



TatB protein 



E 



ECO5830 



ACC# 
AJ005830 



Description 
Escherichia coli tatABCD operon.






ORF Name 



135339135 12 61 



NT ID 



AAID 



][ 



NT 
Length 
2^3 



AA 
Length 
— 



Score 



Probability 
|2.9e-35 



Protein name 



Locus Name 



|sp:LEP3_AERHY 



Acc# 
P45794 



Description 

TYPE 4 PREPILIN-LIKE PROTEIN SPECIFIC LEADER PEPTIDASE,



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


35395926_rl_6 


| |1059 


| 2979 


1 b11 1 


[1535 | |325 | 


|2.2e 


-3b 


Protein name 








Locus Name 




Acc# 


probable helicase 


pir:T40239


T40239 


Description 














ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


3S047308_t2_50 


| |1050 


| |2980 


1 10b 1 


|31S | 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|3939838_c2_173 


|106I 


2981 




201 |189 | 


8.2e-15


Protein name 








Locus Name 




Acc# 



Description 



sp:RL35_PSESY



P52830 



50S RIBOSOMAL PROTEIN L35














ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


394767S_t2_46 


| 1062 


j 2982 


i jm i 


r 3 i 


375 | 


1.6e-34



Protein name 

Description 
HISTIDINE UTILISATION REPRESSOR



Locus Name 



sp:HUTC_KLEAE



Acc# 
P12380 






ORF Name 



NT ID 



AAID 



ci 177 



2983 



NT AA 
Length Length 
WZ1 1 11326 



Score 



Probability 
7.5e-140 



Protein name 



Locus Name 



sp:CLPX_HAEIN 



Acc# 
P44838 



Description 

ATP-DEPENDENT CLP PROTEASE ATP-BINDING SUBUNIT CLPX



ORF Name 



14328428 cl 119 



][ 



NT ID 



AAID 



NT AA 

— , — , Score 

Length Length 

nmj 1 mr 



1 1 



sir 



Probability 
2.6e-91



Protein name 



Description 



Locus Name 



sp:CYSD_MYCTU



Acc# 
Q10599 



SULFURYLASE) 


ORF Name NT ID 


AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


|4<:85953_c3_206 | 1065 


| |2985 


| 121 


|366 | 469 


1.8e-44 | 


Protein name 






Locus Name 


Acc# 


ribosomal protein L20






pir:R5EC20


D64930:S08608:A02806:I41282


Description


ORF Name NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|5120443_t3_94 | |1066 


| |2986 


214 


P b | Mi j | 


3.0e-33 | 



Protein name 



Description 



Locus Name 



sp:YACE_VIBVU



Acc# 
Q56741 



HYPOTHETICAL 22.5 KD PROTEIN IN WPD 3'REGION (ORFX)






ORF Name 


NT AA 

NT ID AAID _ , _ 

Length Length 


Score 


Probability 


5B90&43_t2_73 


| |I067 | |2987 | |380 | |1143 | 


pi | 


|7.7e-07 



Protein name 



Locus Name 



dnaJ protein homolog



pir:S34632 



ACC# 
S34632 



Description 






ORF Name 



7277 t3 93 



NT ID 
] [1058 



AAID 



NT AA 
Length Length 
WT1 1 \TTT7 



— _ Score 



Probability 
3.8e-65 



Protein name 



Locus Name 



pilus assembly protein PilC 



gp:AF038655 



Acc# 
AF038655 



Description 



Legionella pneumophila pilus assembly protein PilB (pilB), pilus assembly
protein PilC (pilC), and type IV prepilin-like protein specific leader
peptidase PilD (pilD) genes, complete cds.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|978S17_c2_16G 


| [1069 


| |2989 


1 1 4 " i 


|1302 | |615 | 


|5.8e-I12 1 


Protein name 








Locus Name 


Acc# 










sp:GLTS_HAEIN


P45240


Description 












SODIUM/GLUTAMATE SYMPORT CARRIER PROTEIN (GLUTAMATE PERMEASE)




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


I054771i_c2_8 


| [1070 


|2990 


275 


|828 | 933 


|1.2e-93 



Protein name 



Description 



Locus Name 



sp:ABC_HAEIN



Acc# 
P44785 



ATP-BINDING PROTEIN ABC










ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Probability 


188807_C2_1U 


| (1071 


| |2991 


1 1 118 i 


|354 | |320 | 




Protein name 








Locus Name 


ACC# 



sp:PLPA_PASHA



Description 

OUTER MEMBRANE LIPOPROTEIN 1 PRECURSOR (PLP1)



Q08868:Q07363






ORF Name 



NTID 



AAID 



NT 



AA 



Length Length 



Score Probability 



195S9635_cl_7 


1072 2992 


99 


300 | |190 | 


6. Be 


-15 


Protein name 






Locus Name 




Acc# 


ORF120 






gp:ECORRNHK12




D15061 


Description 




E.coli genomic DNA, 5' flanking region of rrnH gene.




1 


ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|24353193_ti_i | 


1073 | |2993 


i 9t i 


|288 | |113 | 


9.3e-07


Protein name 






Locus Name 




Acc# 


hypothetical protein PH0133




pir:C71234 




C71234 


Description 












ORF Name 


NTID AAID 


NT 
Length 


AA 

— Score 
Length -- - - 


Probability 


|35582056_c3_ii 


|1074 | 2994 


i 240 i 


723 531 | 


1.2e-61



Protein name 



Locus Name 



|sp:YAEE_HAEIN 



Acc# 
P46492 



Description 

HYPOTHETICAL ABC TRANSPORTER PERMEASE PROTEIN HI0620.1



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

- — . , Score 
Length 


Probability 


|4144442_r2j5 


| |1075 


| |299B 


1 1 131 i 


p5S | 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 

Length 


AA 

„ — . , Score 
Length 


Probability 


16924127_ll_i 


| 1076 


| |299<=> 


| 367 


1104 | |137 | 


|7.2e-06 


Protein name 








Locus Name 


Acc# 



E 



PSENOSA 



M60717 



Description 

P.stutzeri NosA protein (nosA) gene, complete cds.






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

, — , Score 
Length 


Probability 


5i09792_±2_3 


| 1077 


2997 


,m | 


p&9 | 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT 


ORF Name


NT ID




NT 
Length 


AA 

„ t , Score 
Length 


Probability 


|15681688_ci_3 


| |107« 


2998 




|m | |bb | 


10 . 044 


Protein name 








Locus Name 




Acc# 










sp:RNH_HELPY




P56120 


Description 














RIBONUCLEASE H, (RNASE H)












ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


24039i43_ii_I 


| |1079 


| |2999 


1 \ i4b 1 


|1038 537 | 


p. 8. 


-62 


Protein name 








Locus Name 




Acc# 
















ornithine decarboxylase






pir:D72200




D72200 


Description 














ORF Name 


NT ID 


AAID 


NT 
Length 


, — . , Score 
Length 


Probability 


29337B40_t2_2 


| jioao 


| pooo 


150 


450 |95 | 


[0.014 


Protein name 








Locus Name 




Acc# 


AvtA 


gp:AF014804


AF014804 



Description 



Neisseria meningitidis PglB (pglB), PglC (pglC), PglD (pglD), and AvtA (avtA)
genes, complete cds.



ORF Name 



11900461 t2 7 



NTID AAID 
"| [1081 



NT AA 
— — Score 
Length Length 



TOUT" 



][ 



Protein name 
Description 
NO-HIT



Locus Name 



Probability 



ACC# 






ORF Name 



14573552 EI 2 



Protein name 



Description 



NTID 



TUZT 



AAID 



TOUT- 



NT AA 
Length Length 
JZI 1 11026 



Score 



Locus Name 



sp:TRMU_ECOLI 



Probability 
1.1e-108



Acc# 

P25745:P75964



(EC 2.1.1.61) 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


30270465_rlJ3 


|1083 


| |3003 


1 ^ 1 


|750 | |185 | 


|5.7e-14 | 


Protein name 








Locus Name 


Acc# 



sp:YYAD_BACSU



P37520 



Description 

HYPOTHETICAL 37.7 KD PROTEIN IN RPSF-SPO0J INTERGENIC REGION



ORF Name 



NTID AAID 



6365936 12 6 



AA 

— Score 
Length Length 

TTTT 



NT 
n 

TOT 



TIT 



Probability 
6.5e-13 



Protein name 



Description 



Locus Name 



gp:ECPURB



ACC# 
X59307 



E.coli ORF-15, ORF-23, purB and phoP (5' end) genes.




ORF Name 


NT 

NTID AAID , 

Length 


AA 

. — . , Score 
Length 


Probability 


6537957_ci_ii 


| |1085 | 3005 | |200 | 


|603 | |454 | 


6.8e-43 | 


Protein name 




Locus Name 


Acc# 



sp:YGBB_ECOLI 



P36663 



Description 

HYPOTHETICAL 16.9 KD PROTEIN IN SURE-CYSC INTERGENIC REGION (ORF 0)



ORF Name 



NTID 



AAID 



173187 t3 5 



NT AA 
Length Length 

I5B 



Score Probability 



E?0 



Protein name 
Description 

NO-HIT



Locus Name 



Acc# 






ORF Name 


NTID 


AAID 


NT 
Length 


— , Score 
Length 


Probability 


14407527 + 2 2 


[TO 8 7 


1 13007 1 

1 r uu/ 1 


1355 
I 


1101 |785 ] 


4.5e-78


Protein name 








Locus Name 




ACC# 










sp:LCFA_ECOLI 




P29212 


Description 
















SYNTHETASE)




ORF Name 


NTID 


AAID 


NT 
Length 


— Score 

Length 
—> 


Probability 


7285152_rl_i 


| 1088 


| 3008 | 


P 1 


|207 | 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


160S6b63_c2_63 


| 1089 


| 3009 j 


141 


425 190 1 
1 1 


|0.020 


Protein name 








Locus Name 




ACC# 


hypothetical wttw protein






pir:T41252




T41252 


Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, , Score 
Length 


Probability 


1985087_t3_30 


| 1090 


|3010 


412 | 


11239 1 1316 
1 1 


3.1e-134


Protein name 








Locus Name 




Acc# 










sp:SERA_HAEIN 




P43885 


Description 














D-3-PHOSPHOGLYCERATE DEHYDROGENASE, (PGDH)










ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


2072762Jbl_10 


| 1091 


ipoii | 


1220 I 


3553 | 524 | 


4.4e-93


Protein name 








Locus Name 




Acc# 


chromosome segregation smc
protein:minichromosome stabilizing protein
SMC


pir:G69708










G69708:JC4819:PC4029



Description 






ORF Name 



NT ID 



AAID 



21531252 t3 36 



TUTT 



NT AA 

— , , — A , Score 

Length Length 

TCT3 1 FIT 



[722- 



Probability 
2.7e-71



Protein name 



Locus Name 



translation elongation factor EF-Ts 



pir:EFECS



Description 



ORF Name 



ACC# 

A03525:A45269:A32881:S45235:B6



123405 tl 11 



NT ID 
] [1093 



AAID 



NT AA 
Length Length 
|279 



— . , Score 



EZZI [ 



Probability 
6.1e-90



Protein name 



Description 



Locus Name 



sp:RS2_SPIPL



Acc# 
P34831 



30S RIBOSOMAL PROTEIN S2


ORF Name NT ID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


23437562_ti_7 | |1094 |3014 


253 


|752 | |186 | 


3.2e-18


Protein name 




Locus Name 


Acc# 


hypothetical protein HP0862




pir:F64627 


F64627 


Description 








ORF Name NT ID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|23613510_c2_70 | |i095 | [301S 


| 374 


|1I25 | |700 | 


p.8e-69 | 



Protein name 



Description 



Locus Name 



sp:YCFO_ECOLI



ACC# 
P75949 



HYPOTHETICAL 37.6 


KD PROTEIN IN t'HUU-NDH INTERGENIC 


REGION 




ORF Name 


NT AA 

NT ID AAID „ . , _ — 

Length Length 


Score 


Probability 


|2386i6B6_t3_3B 


| |1096 |30i6 |180 |543 


1 52S 1 


1.6e-50



Protein name 



Locus Name 



invasion protein homolog 



gp:AF116285



Acc# 
AF116285 



Description 



Pseudomonas aeruginosa invasion protein homolog
and phosphoenolpyruvate-protein phosphotransferase PtsP (ptsP)
genes, complete cds.






ORF Name 



NT ID 



AAID 



12460181 tl 1 



T7T9T 



JUTT 



NT AA 
Length Length 
W51 1 12964 



Score 



Probability 
T5TT3 



Protein name 



Locus Name 
|sp:RWfe_&SB&U 



Description 

BETA CHAIN) (RNA POLYMERASE BETA SUBUNIT)



Acc# 
P19175 



ORF Name 



NT ID 



AAID 



125401687 c3 88 



] 



][ 



NT AA 
Length Length 

] EE ~ " ' 



Score 



] EZO EZD 



Protein name 



Locus Name 



gp:PAU89892 



Probability 
|3.3e-34 

Acc# 
U89892 



Description 

Pseudomonas aeruginosa virulence factor regulator (vfr) gene, partial cds.



ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|260<54760_c3_93 


|1093 | 3019 




pei | 




Protein name 






Locus Name 


Acc# 


Description 










NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|2989070i_t3_31 


| |1100 | p020 


1 46y 1 


|i4io | jifeva | 


|1.3e-172 | 


Protein name 






Locus Name 


Acc# 



Description 



|sp:GSHR_HAEIN 



P43783 



GLUTATHIONE REDUCTASE, (GR) (GRASE)


ORF Name NTID AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


32287557_r2_16 | 1101 | |302l 


|1422 


4269 | |4932 | 


|0.0 




Protein name 




Locus Name 




Acc# 


99% identity over 1407 amino acids with E. coli


gp:STYSTMF1


AF170176 











Description 

Salmonella typhimurium fragment STMF1.






ORF Name 



NTID AAID 



AA 

— Score 
Length Length 



3907166 tl 6 



TTUT 



NT 

|n 



Probability 



Protein name 



[930 | |117 | |1.6e-06 

Locus Name 



putative biotin protein ligase



gp:AF016461



Acc# 
AF016461 



Description 



Bordetella pertussis putative biotin protein ligase (birA) gene, complete cds
and Bvg accessory factor (baf) gene, partial cds.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|4507i38_ti_5 


1103 


p023 


| 727 j 


2184 | 


|73S 


|2.5e-88 | 



Protein name 

Description 
PROTEIN) 



Locus Name 



sp:PRC_ECOLI



ACC# 
P23865 



ORF Name 
|4$5950i_cprr 



NTID AAID 



TTuT- 



][ 



NT AA 
Length Length 



— Score Probability 



Protein name 

Description 

NO-HIT



Locus Name 



Acc# 



ORF Name 
|B119452_cr?T 



NTID AAID 



XT 



P5 1 [3025 | 



NT AA 

— — Score 

Length Length 

7 XZT 



] f"~i 



Protein name 



Description 



Locus Name 



sp:Y902_HAEIN



Probability 
|5.9e-37 



Acc# 
P44070 



HYPOTHETICAL PROTEIN HI0902


ORF Name NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|11955251_c2_2G | iiOS | |3026 


|915 


2748 


|2735 | 


|1.3e-284 | 



Protein name 



Locus Name 



sp:SYA_ECOLI 



Description 

ALANYL-TRNA SYNTHETASE, (ALANINE--TRNA LIGASE) (ALARS)



Acc# 

P00957:P78279






ORF Name 



NT ID 



AAID 



116523292 c2 25 



TTOT 



NT AA 
Length Length 
^ 1 11467 



Score 



Probability 
|2.4e-157 



Protein name 



Description 



Locus Name 



sp:PUR8_HAEIN 



Acc# 
P44797 



ADENYLOSUCCINATE LYASE, (ADENYLOSUCCINASE) (ASL)




ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

t — , i Score 
Length 


Probability 


|23475417_t3_18 


| 1108 


|3028 


73 


|222 | 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


_ — . , Score 
Length 


Probability 


24308500_cl_22 


1 1109 


3029 


|155 


|468 | 282 | 


1.2e-24



Protein name 



Locus Name 



erythroid differentiation-related factor 2



gp:AF040248



Acc# 
AF040248 



Description 

Homo sapiens erythroid differentiation-related factor 2 mRNA, partial cds.



ORF Name 
|31444625_c3_34 
Protein name 

Description 



NTID AAID 



Score Probability 



TTTu" 



][ 



NT AA 
Length Length 
452 | [1359 | |646 | | 3.1e-63 

Locus Name Acc# 



sp:YCLF_BACSU 



P94408 



HYPOTHETICAL 53.3 KD PROTEIN IN SEP-GERKA INTERGENIC REGION




ORF Name 


NTID AAID 


NT AA 
— — Score 
Length Length 


Probability 


33632752_c2_27 


| 1111 |3031 


285 |855 | 679 | 


|9.Be-67 | 



Protein name 



Locus Name 



aspartate kinase, II
precursor:lysine-sensitive aspartokinase II



pir:A48946



Description 



Acc# 

A48946:B48946:C48946






ORF Name 



33829043 c3 33 



Protein name 
Description 



NT ID 



AAID 



TTTT 



NT AA 
Length Length 

J EE 



— Score Probability 



J 



Locus Name 



Acc# 



NO-HIT 



ORF Name 



I 44002U3 cl 23 



Protein name 
Description 



NT ID AAID 



NT AA 
— — Score 
Length Length 



][ 



TTTT" 



] I 3033 | P 5 | | 2Sfl | 



Locus Name 



Probability 



Acc# 



NO-HIT



ORF Name 
|6852262_ci_2i 

Protein name 
Description 

NO-HIT



NTID AAID 
]E TT 0 



NT AA 
— , — A _, S core 
Length Length 



Locus Name 



Probability 



Acc# 



ORF Name 



9933552 tl 3 



Protein name 



Description 



NTID AAID 
TTT5 1 ITuTT 



NT 



AA 



Length Length 
18 2 | |249 



Score Probability 



Locus Name 



Acc# 



NO-HIT



ORF Name 



99750b2 c2 24 



Protein name 
Description 



NTID 



AAID 



NT AA 
— , — , Score 
Length Length 



][ 



[TTTT" 



132 



Locus Name 



Probability 



ACC# 



NO-HIT



ORF Name 



1036637 c3 276 



Protein name 



Description 



NTID AAID 

IEEE 



NT AA 
— — Score 
Length Length 



Locus Name 



Probability 



Acc# 



NO-HIT 






ORF Name 



Protein name 



NT ID AAID 



TTUT 



][ 



1TJTS" 



NT 



AA 

— Score 
Length Length 



Probability 
1526 | |i.7e-lS6 — 



Locus Name 



lactate dehydrogenase



gp:NMU58911



Acc# 
U58911 



Description 



Neisseria meningitidis lactate dehydrogenase (lldA), HI0379 homolog genes,
complete cds, HI1054 homolog gene, partial cds.



ORF Name 



NTID AAID 



|10S47SBfl_ci_i80 ~] 
Protein name 



TTTT 



I FH f 



NT AA 
Length Length 

ED 



T7F 



Score Probability 



8.1e-08



Locus Name 



hypothetical protein APE1165 



pir:H72586 



Acc# 
H72586 



Description 
ORF Name 



NTID AAID 



111711 c3 255 



TT27T 



NT 

in 



AA 

— Score 
Length Length 



Probability 
1.3e-46



Protein name 



Locus Name 



HisX 



gp:AF010189



ACC# 
AF010189 



Description 



Pseudomonas stutzeri Htic mtic) gene, partial cds; HisX (hisX) gene,
complete cds; and PurA (purA) gene, partial cds.



ORF Name 


NT 

NTID AAID „ , , 

Length 


AA 
Length 


Score 


Probability 


11991552J:3_iu8 


1121 | |3041 285 | 




514 


7.6e-60 


Protein name 




Locus 


Name 


Acc# 






sp:TRPC_PSEPU 


P20578 


Description 










INDOLE-3-GLYCEROL PHOSPHATE SYNTHASE, (IGPS)








ORF Name 


NT 

NTID AAID _ . , 

Length 


Length 


Score 


Probability 


12222077_±2_53 


| |1122 | |3042 | |158 


i 547 i 


i 4 " i 


|1.8e-44 



Protein name 
Description 

RECOMBINATION PROTEIN RECR



Locus Name 



sp:RECR_HAEIN 



ACC# 
P44712 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

r — , i Score 
Length 


Probability 


|I36302_cI_i67 


i 


|p04i | 


1154 


465 |95 | 


10.0012 


Protein name 










Locus Name 




Acc# 


hypothetical protein siiib/b


pir:S74649


S74649 


Description 
















ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


13S787S3_ci_ia2 


| 1124 


| |3044 | 


|223 


|672 | |194 | 


2.4e 


1 


Protein name 










Locus Name 




Acc# 


hypothetical protein RP471






pir:D71706 




D71706 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


14094452_c3_at7 


i i mt 


|3045 


60 


1183 1 






Protein name 










Locus Name 




ACC# 


Description 
















NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


14103402_tl_7 


| 1126 


| |3046 


285 


|861 | 255 


7r3e 


-23 


Protein name 










Locus Name 




Acc# 












sp:YPUG_BACSU




P35154 


Description 














HYPOTHETICAL 29.6 KD PROTEIN IN RIBT-DACB INTERGENIC REGION (ORFX7)


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|14273586_t2_b7 


|1127 


| 3047 | 


|451 


|1356 | |150fl | 


1.4e-154


Protein name 










Locus Name 




Acc# 












sp:ACCC_PSEAE 


P37798 



Description 



CARBOXYLASE, ) (ACC)






ORF Name NT ID 


AAID 


NT 
Length 


AA 

t — , i Score 
Length 


Probability 


114572132 f3 135 1128 


3048 


238 


717 736 


8.9e-73


Protein name 








Locus Name 




Acc# 










sp:END3_HAEIN


P44319 


Description 














LYASE) 




ORF Name NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|14875305__t3_113 | 


| 3049 


1 1" 9 i 


|420 | |141 | 


1.2e-09


Protein name 








Locus Name 




Acc# 


Ribonuclease D (EC 3.1.13.-)




|gp:D90825 




Description 












D90825:AB001340


E.coli genomic DNA, Kohara clone #334(40.6-41.0 min.).








ORF Name NTID 


AAID 


NT
Length


AA 

_ — . , Score 
Length 


Probability 


15507827_c3_256 | 1130 


| 3050 


519 | 


|1560 | |1030 | 


6.3e-104


Protein name 








Locus Name 




ACC# 










sp:NADB_PSEAE 




Description 












Q51363:Q51412


L-ASPARTATE OXIDASE, (QUINOLINATE SYNTHETASE












ORF Name NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|15510254_ci_l!>4 | 1131 


| 3051 


| ioa | 


|927 | |1063 | 


2.0e-107


Protein name 








Locus Name 




ACC# 










sp:RF1_ECOLI




Description 












P07011:P77340


PEPTIDE CHAIN RELEASE FACTOR 1 (RF-1)












ORF Name NTID 


AAID 


NT 
Length 


AA 
Length 


Probability 


16829i77_ci_172 | 1132 


| 3052 


9b | 


|288 | |83 | 


|0.041 | 


Protein name 








Locus Name 




ACC# 


F1N21.17 




gp:AC002130 


AC002130 



Description 



The sequence of BAC F1N21 from Arabidopsis thaliana chromosome 1, complete
sequence.






ORF Name 



NT ID AAID 



119562800 12 52 



NT 



AA 

— Score 
Length Length 



P5T 



T7T 



Probability 
1.0e-23



Protein name 



Description 



Locus Name 



sp:YBAB_HAEIN 



Acc# 
P44711 



HYPOTHETICAL PROTEIN HI0442








ORF Name 


NT ID AAID 


NT 
Length 


_ — . , Score 
Length 


Probability 


[20312551_li_8 


| |1134 | |3054 


[202 | 


pi) | P 43 | 


1.6e-20


Protein name 






Locus Name 


Acc# 



Description 



sp:YPUH_BACSU



P35155 



HYPOTHETICAL 22.0 KD PROTEIN IN RIBT-DACB INTERGENIC REGION (ORFX8)


NT AA 

ORF Name NTID AAID _ — . , _ 

Length Length 


Score Probability 


|20573252_c2_229 1135 |3055 | 205 |618 


|309 | |i.6e-27 | 



Protein name 



Description 



Locus Name 



sp:YDJA_ECOLI 



ACC# 
P24250 



HYPOTHETICAL 20.1 KD PROTEIN IN SELD-SPPA INTERGENIC REGION (ORF183)


ORF Name 


NT AA 
NTID AAID . — . , _ — . , Score 
Length Length 


Probability 


|216G7027_t2_89 


| 1135 |3056 | 832 |2499 | |2093 | 


|1.4e-21S 



Protein name 



Description 



Locus Name 



sp:LON_ERWAM 



Acc# 
P46067 



ATP-DEPENDENT PROTEASE LA,


ORF Name NTID 


AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


|2158i503_ri_9 | 1137 


| |3057 


i p» i 


990 |758 | 


|4.2e-75 


Protein name 






Locus Name 


ACC# 



sp:YCIL_HAEIN 



P45104 



Description 
HYPOTHETICAL PROTEIN HIii.99 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 








[197 


|594 | 284 | 


|7. le 


1 


Protein name 










Locus Name 




Acc# 












sp:YBEY_ECOLI 


P77385 


Description 
















HYPOTHETICAL 17.5 KD PROTEIN IN CUTE-ASNB INTERGENIC REGION








ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|22128380_il_4I 


| |1139 


II 3059 1 


r i 


I 264 1 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT 




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


22384<535_tl_31 


| 1140 


| |3050 


190 | 


pa | p 4 | 


K3.7e- 


-by | 


Protein name 










Locus Name 




ACC# 


HemO 




gp:AF133695


AF133695 



Description 



Neisseria meningitidis HemO (hemO) gene, complete cds; and HmbR (hmbR) gene, partial cds.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


23444377_ci_199 


1 1 1141 


|3061 


492 | 


|1479 | |2354 | 


|2.7e 


-245 


Protein name 










Locus Name 




Acc# 


outer membrane protein E 




gp:MB00MPE 


L31788 


Description 














Moraxella catarrhalis outer membrane protein E gene, complete cds.




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|23470u_c3_268 


|1142 


| |3062 


1 P 44 1 


|1035 J1002 | 


5.8e- 


-101 


Protein name 










Locus Name 




ACC# 


unknown 


gp:AF109131


AF109131 



Description 



Sinorhizobium meliloti homogentisate dioxygenase (hmgA) and maleylacetoacetate isomerase (maiA) genes, complete cds; and unknown gene.






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


123620253 c3 271 


1 11143 
1 1 


| 3063 


267 


804 534 | 


2.3e-51


Protein name 










Locus Name 


Acc# 












sp:KDSB_ECOLI


P04951 


Description 














SYNTHETASE) (CMP-2-KETO-3-DEOXYOCTULOSONIC ACID SYNTHETASE) (CKS)


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

Length Sc ° re 


Probability 


123551513 tl 43 


1 11144 


| (3064 


|259 


|780 | |589 | 


|3.4e-57 | 


Protein name 










Locus Name 


ACC# 












sp:NADC_RHORU 


P77938 


Description 














) (QAPRTASE) 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|24031586_ti_5 




| p«s 


| 383 


| |1152 | |976 | 


3.3e-58 | 


Protein name 










Locus Name 


Acc# 



Description 



sp:TRPD_ACICA 



P00500 



ANTHRANILATE PHOSPHORIBOSYLTRANSFERASE,


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|24397555_t3_140 




| |3066 


1 P 1 * 1 


|660 | |165 | 


1.2e 


1 


Protein name 










Locus Name 




ACC# 


probable corA protein








pir :F70952 


F70952 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


24415782_r3_106 


1147 


3067 




| |87 | 


J0.0075 | 


Protein name 










Locus Name 




Acc# 


UUP protein 




gp:ECUUP 


Y09439 



Description 



E.coli uup gene, partial.






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Probability 


24721962_c3_282 


1148 


3068 


77 


234 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT | 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|24801937_c2_236 


| |1149 


| |3069 


ih j i 


r i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|26755325_riJ. 


| |1150 


3070 


1 * 14 1 


|945 | |102i| 


|5.5e-103 | 


Protein name 








Locus Name 


Acc# 










sp:OTCA_PaE^H 


Q02047 


Description 












(EC 2.1.3.3) (OTCASE) | 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


26757790_c3_255 


1151 


3071 


| 376 


1131 |10S5| 


|1.2e-107 | 


Protein name 








Locus Name 


Acc# 



Description 



sp:NADA_ECOLI 



P11458:P77373



QUINOLINATE SYNTHETASE A










ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , i Score 
Length ■■■ ■ ■ 


Probability 


2742036_tl_3 


| |llb2 


| 3072 


144 


|435 | 426 


6.3e-40 


Protein name 








Locus Name 


Acc# 



Description 
T5ECAEBSXYEASET 



'sp:PAMJ_BACSU 



P52999 






ORF Name 



NT ID AAID 



129301591 tl 14 



TT5T 



TTT7T 



NT AA 
Length Length 
TZZ — 



Score 



ITT 



TT0~ 



Protein name 



Description 



Locus Name 



sp:RND_HAEIN



Probability 
|4.7e-07 

Acc# 
I P44442 



RIBONUCLEASE D, (RNASE D)


ORF Name NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|29355287_t2_99 | |1154 


| 3074 


i r 4 


|225 | |79 | 


10.013 


Protein name 






Locus Name 


Acc# 


MutT/nudix family protein






pir :A75550 


A75550 


Description 










ORF Name NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|29484525_cl_173 | |1155 


| |3075 


i p ss i 


1071 |555 | 


|2.4e-57 


Protein name 






Locus Name 


ACC# 



Description 



sp:YGI2_PSEPU



P31857 



HYPOTHETICAL 32.4 KD PROTEIN IN GIDB-UNCI INTERGENIC REGION




i 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|2953i392_ti_40 


1 I 1156 


||307t | 


|i,4 | 


|375 | |173 | 


|1.5e 


-11 


Protein name 










Locus Name 




Acc# 


lustrin A




pir :T08852 


T08852 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


29572155_r2_49 


1 I 1157 


|3077 | 


p" i 


|1032 |406 | 


8.3e 


1 


Protein name 










Locus Name 




Acc# 



Description 
PROTEIN B) 



sp:HTRB_HAEIN



P45239:Q48045






ORF Name 


NTID AAID 


NT 
Length 


— Score 
Length 


Probability 


23TT9053I f 2 56 


" 1158 | 3078 


157 1 
1 


474 259 


3.1e-22


Protein name 






Locus Name 




ACC# 








sp:BCCP_HAEIN 




P43874 


Description 












BIOTIN CARBOXYL CARRIER PROTEIN OF ACETYL-COA CARBOXYLASE (BCCP)




ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


130204837 c2 230 


1 fTT59 13075 


|447 | 


|1344 |628 | 


|2 . 5e 


-61 


Protein name 






Locus Name 




Acc# 








sp:GPr>A_ECOLI 




P37606 


Description 














ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — Score 
Length 


Probability 


30330056_12_48 


jiiSO 3080 


233 


|702 | |506 | 


|2.1e- 


-48 


Protein name 






Locus Name 




Acc# 


putative ATP-binding protein




gp:NME242841 




AJ242841 


Description 




Neisseria meningitidis DNA for opcA region, strain Z2491.






ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


32878_c2_250 


|ii5i |308i 


80 


P 4i | r | 


|0. 00033 | 


Protein name 






Locus Name 




Acc# 








sp:SLYX_ECOLI 




P30857 


Description 












SLYX PROTEIN 


ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


341803i7_ti_22 


1162 |3082 


115 


|348 | 420 


|2.7e- 


1 


Protein name 






Locus Name 




Acc# 








sp:YCHF_HAEIN 




P44681 


Description 
















ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


3441705)0_t3_lBl 


1163 


|3083 


i 75 i 


P 31 1 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT | 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


35196902_t3_129 


|11S4 


| 3084 


| |tu 


1851 | |631 


|1.2e 


-61 


Protein name 










Locus Name 




ACC# 


L-lactate permease (lctP) homolog




1 


pir:F69350 


F69350 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


35335i37_cl_175 


|1165 


j |3085 


| 330 


1993 I 1302 


|8.7e 


-27 


Protein name 










Locus Name 




ACC# 



Description 



sp:HOLB_PSEAE



P52024 



DNA POLYMERASE III, DELTA' SUBUNIT,








ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


35353443_c2_215 


1166 


3086 1 


347 


|1044 | |1128 


2.6e-114


Protein name 








Locus Name 


Acc# 



sp:PURA_VIBPA



P40607 



Description 

ADENYLOSUCCINATE SYNTHETASE, (IMP--ASPARTATE LIGASE)



ORF Name 



NTID 



AAID 



35603128 t3 1B0 



][ 



T01T7~ 



NT 
Length 

] EZZ 



AA 

T — *-v. Score 
Length 



Protein name 

Description 

NO-HIT 



Locus Name 



Probability 



ACC# 






ORF Name 



NTID 



AAID 



NT 



AA 



— . „ — . Score Probability- 
Length Length *- 



3914143_c2_209 




| jiiSfl 3088 


1 ^ 1 


1558 256 


2.1e-29


Protein name 










Locus Name 




Acc# 


ExbB protein 


gp:BPE132741


AJ132741 


Description 
















Bordetella pertussis hupB, tonB, exbB, exbD and basR genes and ORF1 (partial).
















ORF Name 




NTID AAID 


NT 
Length 


AA 

T — . 1 Score 
Length ■— - 


Probability 


3914811_c3_280 




| |ii69 | |3089 


370 | 


|iii3 | 873 


2 .7e 


-87 


Protein name










Locus Name 




Acc# 












sp:YHCM_ECOLI 


P46442 


Description 
















HYPOTHETICAL 43.1 KD PROTEIN IN RPLM-HHOA INTERGENIC REGION (F375)


ORF Name 




NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


3937518_ci_17i 




| |1170 | 3090 


1 P 64 1 


|795 | |£18 


2 .9e- 


-60 


Protein name 










Locus Name 




ACC# 












sp:YGI1_PSEPU


P31856 


Description 
















HYPOTHETICAL 28.5 KD PROTEIN IN GIDB-UNCI INTERGENIC REGION






ORF Name 




NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


33388i8_ci_166 




| |Ii7I | |309i 


1 p 1 


|198 | 228 


|6.1e- 


1 


Protein name 










Locus Name 




Acc# 


PurA 




gp:AF010189 


AF010189 



Description 



Pseudomonas stutzeri HflC (hflC) gene, partial cds; HisX (hisX) gene, complete cds; and PurA (purA) gene, partial cds.



ORF Name 



NTID 



AAID 



3946931 t3 114 



TT7T 



][ 



NT AA 
Length Length 
TTZ — 



Score Probability 



Protein name 



|65i I |550 I |4.0e-54 ~ 

Locus Name Acc# 



gpTEO" 



ECU89166 



U89166 



Description 

Eikenella corrodens lysine decarboxylase (ECORLD) gene, complete cds.






ORF Name 



13960260 tl 23 



Protein name 



NT ID AAID 



TT7T 



NT AA 

— — Score 

Length Length 

7ST 



1TTF 



Locus Name 



probable transcription regulator 



pir:T34763



Probability 
1.2e-24

Acc# 
T34763 



Description 

ORF Name 
|4022217J:2_6a 



NT ID 



AAID 



TTPT 



][ 



I3TTST" 



] 



NT AA 
Length Length 

] EEZ 



Score 



] EZI 



Probability 
0.00087 



Protein name 



Locus Name 



hypothetical protein F53A9.8



pir:T16439



Acc# 
T16439 



Description 



ORF Name 



NTID AAID 



4023443 c3 278 



TTTET 



][ 



NT AA 
Length Length 



Score 



I |441 | [498 | 



Protein name 



Locus Name 



Probability 
|l.be-47 

ACC# 



sp:NDK_PSEAE



Q59636 



Description 

NUCLEOSIDE DIPHOSPHATE KINASE, (NDK) (NDP KINASE) 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|4101387_±i_4 


1176 


3096 


219 


660 


|71 0 | 


|5.1e 


-70 


Protein name 








Locus 


Name 




Acc# 










sp:TRPG_PSEAE




P20576 


Description 
















TRANSFERASE) 


ORF Name 


NTID 


AAID 


NT 

Length 


AA 
Length 


Score 


Probability 


4145000_t2_47 


| 1177 


| |3097 


i F s i 


1248 | 


962 | 


1.0e-96



Protein name 

Description 
ABC TRANSPORTER ATP,- BINDING PROTEIN OOP- 1 



Locus Name 



|sp:OOPl_HAEIN 



Acc# 

Q57242:O05056






ORF Name 


NT ID 


AAID 


NT 
Length 


— Score 
Length 


Probability 


|4181502_t3_149 


1178 


| 3098 


| 130 


393 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. . , Score 
Length 


Probability 


|4182762_r2_51 


| |1179 


| |io yy 


i p b ° 


|1077 | |374 | 


|1.9e-49 | 


Protein name 








Locus Name 


Acc# 


tryptophan--tRNA ligase,






pir:H70385 


| H70385 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|424203_tl_24 


| 1180 


3100 


i r 


J1053 | 542 | 


3.2e-52 


Protein name 








Locus Name 


Acc# 


putative exodeoxyribonuclease (EC 3.1.11.2).


gp:SCE87


AL132674 


Description 


Streptomyces coelicolor cosmid E87








ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|4331262_t3_141 


1181 


3101 


222 


669 402 


2.2e-37 


Protein name 








Locus Name 


Acc# 


probable corA 


protein 






pir :F70952 


F70952 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|4506930_cl_i74 


| |1182 


| |3102 


|35S 


1071 | |504 | 


|3.4e-48 | 


Protein name 








Locus Name 


ACC# 



sp:LPXK_HAEIN 1 P44491 



Description 

TETRAACYLDISACCHARIDE 4'-KINASE, (LIPID A 4'-KINASE)






ORF Name 
|4537837_t3_145 



NT ID 



AAID 



TT3T 



13103 



NT AA 

— — Score 

Length Length 

2T5 1 



Probability 
|2.5e-45 



Protein name 



Locus Name 



YciB homolog 



gp:AF114793 



Acc# 
AF114793 



Description 



Vitreoscilla sp. YciB homolog, putative transcriptional activator, putative outer membrane protein, BioA homolog, and glutamine synthetase homolog genes, complete cds; and unknown genes.



ORF Name 


NT ID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


457I880J:i_I6 


| 1184 | |3104 


1 1 216 1 


i" 1 i r i 


|4.3e-4i | 


Protein name 






Locus Name 


Acc# 


YbeZ protein






gp:STY249116


AJ249116 


Description 


Salmonella typhimurium yleB (partial), miaB, ybeZ and ybeY (partial) genes.


ORF Name 


NT ID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


j4722125_c2_2il 


1185 | 3105 


1 * bb 1 


1398 |1197| 


J1.3e-12i 


Protein name 






Locus Name 


Acc# 








sp:Y325_HAEIN 


P44640 


Description 










HYPOTHETICAL PROTEIN HI0325


ORF Name 


NT ID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


j4798763_cl_155 


| jilBS | 3106 


141 | 


|426 | |274 


|8.1e-24 | 



Protein name 



Locus Name 



ExbD protein 



gp:BPE132741



Acc# 
AJ132741 



Description 



Bordetella pertussis hupB, tonB, exbB, exbD and basR genes and ORF1 (partial).


ORF Name NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|4876005_rl_30 1187 


| |3107 


1 ^ 


2487 


1231 


| |3.1e-125 | 



Protein name 



Locus Name 



hypothetical protein TM1869 



pir:F72202



Acc# 
F72202 



Description 






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


4978375_c3_293 


| |I188 


3108 


382 


1149 261 


7.7e-26


Protein name 










Locus Name 




Acc# 


beta-ketoacyl-acyl carrier protein synthase III


pir:B64545


B64545






























Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 




| |1189 


1 13109 1 


|620 | 


|1863 | |1268| 


|3.8e 


-129 


Protein name 










Locus Name 




Acc# 












sp:KEFX_HAEIN 


P44933 


Description 
















ANTIPORTER)


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

r — . i Score 
Length 


Probability 


|59765SbJ:3_lli 


| |1190 


| P 110 | 


|769 


|2310 | |3096 | 


|0.0 


1 


Protein name 










Locus Name 




ACC# 












sp:RIR1_ECOLI




P00452:P78088:P78177


Description 














(RIBONUCLEOTIDE REDUCTASE 1) (B1 PROTEIN) (R1 PROTEIN)


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


636513J:3_i26 




1 1 3111 


l 8i 1 


246 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

r — , i Score 
Length 


Probability 


|5516_t2_45 


| 1192 


3112 


|361 | 


1086 |176 | 


|7.2e- 


1 


Protein name 










Locus Name 




ACC# 


hypothetical protein






pir:S76259 




S76259 



Description 






ORF Name NT ID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 




201 


506 |105 | 


0.00055


Protein name 




Locus Name 


Acc# 


phosphoglycerate mutase




pir :G72260 


G72260 


Description 








ORF Name NT ID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


lgfl^7fl?7 rl lfil 1 11194 1 13114 1 


1155 1 
1 1 


1498 1 1145 1 


|3.8e-10 


Protein name 




Locus Name 


Acc# 






sp:Y400_SYNY3 


Q55129 


Description 








HYPOTHETICAL 18.3 KD PROTEIN SLL0400 | 


ORF Name NT ID AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


785578_c3_259 |1195 | 3115 


85 


|25i | |99 | 


|2.8e-05 


Protein name 




Locus Name 


Acc# 


unknown 




gp:AF114793


AF114793


Description 


Vitreoscilla sp. YciB homolog, putative transcriptional activator, putative outer membrane protein, BioA homolog, and glutamine synthetase homolog genes, complete cds; and unknown genes.








ORF Name NT ID AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


959200_c3_270 | |1195 3116 


|234 


|705 | |350 | 


|5.2e-33 


Protein name 




Locus Name 


Acc# 






sp:GIDB_ECOLI


P17113


Description 








glucose inhibited division protein B


ORF Name NT ID AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|970375_f2_57 |ii97 |3117 


279 | 


|840 |921 | 


|2.2e-92 | 


Protein name 




Locus Name 


Acc# 


probable GTP-binding protein HI0393




pir:I54150 


164150 


Description 






ORF Name 


NT ID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


97653i_t2_59 


| |iisa ma 


171 


|B16 | 437 


4.3e 


-41 


Protein name 






Locus Name 




ACC# 


YbeZ protein






gp:STY249116




AJ249116 


Description 




Salmonella typhimurium yleB (partial), miaB, ybeZ and ybeY (partial) genes.


ORF Name 


NT ID AAID 


NT 
Length 


AA 
Length 


Probability 


985925_t3_124 


| 1199 | 3119 | 


128 


pay | 






Protein name 






Locus Name 




ACC# 


Description 












NO-HIT


ORF Name 


NT ID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


9954828_c2_208 


| |1200 | |3120 | 


322 


|9S9 |170 | 


|3.9e- 


-14 


Protein name 






Locus Name 




Acc# 


TonB2 






gp:AF190125 




AF190125 


Description 












Pseudomonas aeruginosa TonB2 (tonB2), ExbB (exbB), and ExbD (exbD) genes, complete cds.


ORF Name 


NT ID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|I0975667_c2_74 


| |120i | |3121 | 


375 | 


| |b24 | 


2.6e- 


-50 


Protein name 






Locus Name 




Acc# 


thiamine-monophosphate kinase




gp:D17333 




D17333 


Description 












E. coli thiL gene, complete cds.










ORF Name 


NT ID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


1250008i_c3_90 


|1202 | 3122 


270 


|813 | 855 


2.2e-85



Protein name 

Description 
HISF PROTEIN (CYCLASE) 



Locus Name 



sp:HIS6_AZOBR



ACC# 
P26721 






ORF Name 



114572127 ci 50 



Protein name 



NT ID 



AAID 



T7UT 



7T7T 



NT 
n 
T55 



AA 

— Score 
Length Length 



Probability 
1.1e-40



Locus Name 



sp:RISB_ECOLI 



Description 

(LUMAZINE SYNTHASE) (RIBOFLAVIN SYNTHASE BETA CHAIN)



Acc# 

P25540:P77114



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


197212_cl_54 


| |1204 


| 3124 


1 284 1 


i 852 i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . _ Score 
Length 


Probability 


2175701i_t3_32 


|1205 


| |3125 


1 


954 |736 | 


|8.9e-73 | 



Protein name 



Locus Name 



YarJ 



[gp:NGAJ2783 



Acc# 
AJ002783 



Description 

Neisseria gonorrhoeae aroK, aroB, yatJ genes and open reading frame.



ORF Name 



|2364b2fc3 cJ 



NTID 
"j [1206 



AAID 



TTZG~ 



NT AA 
Length Length 

] F 1 I [ 



Score Probability 
J |190 | |2.4e-30 ~ 



Protein name 



Description 



Locus Name 



sp:PGPA_HAEIN



ACC# 
P44157 



PHOSPHATIDYLGLYCEROPHOSPHATASE A,


ORF Name NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|23916007_c3_94 | 1207 pi27 


196 | 


I 591 1 


|273 


1.0e-23



Protein name 



Locus Name 



methylase



|gp:LLC!PJW56b 



Acc# 
Y12736 



Description 

Lactococcus lactis cremoris plasmid pJW565 DNA, abiiM, abiiR genes and orfX.






ORF Name 


NT ID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|23947157_t3_37 


1208 | 3128 


653 


1952 755 


6.0e-87


Protein name 








Locus Name 


Acc# 


penicillin-binding protein 3




pir:S54872 


S54872 


Description 












ORF Name 


NT ID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|242S56B7_i2_23 


| |1209 | |3129 


i r 1 i 


|1505 | |537 | 


|i.Be-V9 | 


Protein name 








Locus Name 


ACC# 










sp:MURF_ECOLI


P11880:P77636:O07100


Description


(D-ALANYL-D-ALANINE-ADDING ENZYME)


ORF Name 


NT ID AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


|24353377_ci_4!> 


| |1210 | 3130 


i i 


pa | |ibb | 


|3.3e-ll | 


Protein name 








Locus Name 


Acc# 


hypothetical prot 


ein PAB0131 




pir :D75209 


D75209 



Description 



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


263690i6_i2_i8 


| 1211 


| J3131 


i" 9 i 


900 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

t — ■ i Score 
Length 


Probability 


|339733S_ci_49 


| |1212 


| J3132 


i" i 


|198 |50 | 


0.037


Protein name 








Locus Name 


Acc# 



sp:DHSD_PORPU 



P80479 



Description 
DEHYDROGENASE, SUBUNIT IV) 






ORF Name 


NT ID 


AAID 


NT 
Length 


„ — . , Score 
Length 


Probability


135968792 f2 24 


1213 


3133 


201 


606 346 


1.9e-31 


Protein name 








Locus Name 


Acc# 










sp:TPIS_MORSP 


| Q01893 


Description 












TRIOSEPHOSPHATE ISOMERASE, (TIM)








ORF Name 


NTID 


AAID 


NT 
Length 


— Score 
Length 


Probability 


|3907500_tl__5 


1 1" 14 


| 31.4 


1 1 


|1026 | |227 


|9.2e-18 j 



Protein name 



Locus Name 



homoserine kinase homolog



pir:T33726



ACC# 
T33726 



Description 



ORF Name 



NTID 



AAID 



NT 



AA 



Length Length 



Score Probability 



13939665 ci 51 



Protein name 



1215 | 3135 | 183 | 552 | 204 | 2.1e-16

Locus Name 



sp:NUSB_HAEIN 



Acc# 
P45150 



Description 

N UTILIZATION SUBSTANCE PROTEIN B HOMOLOG (NUSB PROTEIN)



ORF Name 



NTID 



AAID 



NT 



AA 



Length Length 



Score 



Probability 



13953191 ci 44 



Protein name 



1216 | [3136 | |502 | [1509 | p^uT] [7.7e-154 

Locus Name 



glutamyl-tRNA synthetase



gp:AF139107



Acc# 
AF139107 



Description 



Pseudomonas aeruginosa hypothetical multidrug resistance protein (mdr) gene,
partial cds; hypothetical transcriptional activator (act) and glutamyl-tRNA
synthetase (gltX) genes, complete cds; and tRNA-Ala and tRNA-Glu genes,
complete sequence.



ORF Name 
141703 t3 36 



NTID AAID 



11217 



JUT 



NT AA 
Length Length 
T2T5 — 



Score Probability 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 






ORF Name 


NT ID

AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


4301943_tl_13 


1218 






1107 | [1018 | 


1.2e 


-102 


Protein name 










Locus Name 




Acc# 












sp:MRAY_HAEIN


P45062 


Description 
















(UDP-MURNAC-PENTAPEPTIDE PHOSPHOTRANSFERASE)










ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|5111318_t2_22 


1219 


|3139 




|1572 | |756 | 


6.8e-75


Protein name 










Locus Name 




Acc# 


probable 




gp:AF141867


AF141867 



Description 



Vibrio cholerae probable UDP-N-acetylmuramoylalanyl-D-glutamate--2,6-diaminopimelate
ligase (murE) gene, complete cds.



ORF Name 



NTID 



AAID 



5754052 ±3 35 



][ 



1220 



][ 



NT AA 
Length Length 
TTZ 1 [TuTT 



Score 



Probability 
2.0e-S8 



Protein name 



Description 



Locus Name 



sp:YABC_ECOLI



Acc# 
P18595 



HYPOTHETICAL 34.5 KD PROTEIN IN FRUR-FTSL INTERGENIC REGION (ORFB)


ORF Name 


NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


111552_ti_10 


1221 3141 


73 


222 | 






Protein name 






Locus Name


Acc# 


Description 












NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|1189035_c3_48 


|1222 3142 | 


|179 


p" i 


516 | 


4.7e-50 



Protein name 



Locus Name 



adenylate kinase



gp:AB024426



ACC# 
AB024426 



Description 

Pseudomonas putida adk gene for adenylate kinase, complete cds.






ORF Name 



112578208 Tl 15 



Protein name 
Description 



NT ID 
11223 



AAID 



1TTT" 



NT AA 
Length Length 
JXZ 1 [TTFT 



— . , Score 



11244 



Probability 
|1.3e-125 



Locus Name 



sp:DHAS_PSEAE



Acc# 
Q51344 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


23444507_c3_45 


| 1224 


| |3144 


452 | 


|L359 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — Score 
Length 


Probability 


23536511_r2_16 


| |1225 


1 1 3145 


H« 1 


|1017 | |210 | 


|4.1e-15 



Protein name 



Description 



Locus Name 



sp:ASG1_ECOLI



Acc# 
"j P18840 



(L-ASNASE 1) 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|245682_tl_8 


| |I226 




i p« 


| p9 | 


|8.be-68 


Protein name 








Locus Name 


Acc# 



Description 



|sp:TRUA_ECOLI 



P07649 



I) (PSEUDOURIDINE SYNTHASE I) (URACIL HYDROLYASE) (PSU-I)




ORF Name 


NT 

NTID AAID . — . , 
Length 


AA 

. — . , Score 
Length 


Probability 


34157662_C2_41 


| 1227 |3147 203 | 


|612 | 321 


8.5e-29 | 


Protein name 




Locus Name 


Acc# 



spTTTFBTSEPE 



P52237 



Description 
BIOGENESIS PROTEIN TTPB1 






ORF Name 


NTID 


AAID 


NT 


— Score 
Length


Probability 


|4042131_ci_33 


1228 


3148 


70 


P" 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|4112793_cl_37 


|1229 


| 3149 


i p j i 


[1272 | |175 | 


p.le-10 | 


Protein name 








Locus Name 


Acc# 



sp:CCMH_HAEIN 



P46458 



Description 

CYTOCHROME C-TYPE BIOGENESIS PROTEIN CCMH PRECURSOR



ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|4484436_c2_40 


| |1230 


| pi50 


1 1^ 1 


|2079 | 




|1.7e-179 | 


Protein name 








Locus 


Name 


ACC# 










sp:CCMF_PSEFL


P52225 


Description 














CYTOCHROME C-TYPE 


BIOGENESIS PROTEIN CYCK 






1 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|S26S800_il_9 


| |123i 


i i iibi 


1 1" i 


| 23 4 | 


pV4 | 


|8.1e-24 | 



Protein name 



Description 



Locus Name 



sp:IF1_BACSU



Acc# 
P20458 



TRANSLATION INITIATION FACTOR IF-1










ORF Name 


NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


587775_cl_3<!> 


1232 3152 


172 1 


(519 | 


299 1 


|1.8e-26 



Protein name 



Locus Name 



sp:CCMH_ECOLI 



Acc# 
P33925 



Description 

CYTOCHROME C-TYPE BIOGENESIS PROTEIN CCMH PRECURSOR






ORF Name 



NTID AAID 



TTTT" 



7T5T 



NT AA 
Length Length 
TUZ 



Score Probability 
13 09 | |1.6e-27 — 



Protein name 



Description 



Locus Name 



VHHP ECOLI 



Acc# 
P10120 



21.7 KD PROTEIN IN FTSY-NIKA INTERGENIC REGION




i 


ORF Name 


NT AA 
NTID AAID _ — . , . — 

Length Length 


Score 


Probability 


1058462_c3_105 


| |1234 | 3154 | 293 | 952 | 


i" i 


|0.032 | 



Protein name 



Locus Name 



15 kDa vesicular-like antigen



gp:PFAVLAP 



Acc# 
M94732 



Description 

Plasmodium falciparum 15 kDa vesicular-like antigen gene, exons 1 through 4.



ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|13688802_c2__10i 


p^S 1 3155 


i " 


|231 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


14644586_£2_28 | 


1236 | J3I56 


1 1 


|1I76 | |465 | 


|4.7e-44 | 


Protein name 








Locus Name . 


ACC# 


36 kDa protein


gp:HPU86610 


U86610 


Description 




Helicobacter pylori 36 kDa protein gene, complete cds.




ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|15132787_±3_50 | 


1237 | J3157 


| 105 


P" 1 I 197 1 


|1.2e-15 | 


Protein name 








Locus Name 


Acc# 










sp: YDCQ_ECOLI 


" P76107 



Description 

HYPOTHETICAL 16.1 KD PROTEIN IN TKHB-AHSP INTERGENIC REGION






ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


19532661_t2_33 


|1238 | |3158 


77 


234 




Protein name 






Locus Name 


Acc# 


Description 










NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|20335427_c2_78 


| |1239 | 3159 


1 P ' 1 


|1974 | 239 | 


2.0e-17 


Protein name 






Locus Name 


Acc# 


minor tail protein gp26-related protein


pir:F75605


F75605 


Description 










ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


21642510_c2_77 


1240 | |3160 


i r i 


pio | 





Protein name 
Description 



Locus Name 



ACC# 



NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


2i909377_i2_29 


| 1241 


1 I 3151 


r i 


1206 


261 


3.0e-21



Protein name 



Locus Name 



hypothetical protein jhpl380 



pir:G71815 



Acc# 
G71815 



Description 



ORF Name 



NTID 



AAID 



22266577 £2 21 



1242 



TTST 



Protein name 



NT 
n 

77U 



AA 

— — Score 
Length Length 



thiamine-phosphate pyrophosphorylase 



Description 



Locus Name 



gp:AF180145



Probability 
1.8e-19 



ACC# 
AF180145 



Zymomonas mobilis GTP-binding protein CgpA (cgpA), 60KD inner-membrane
protein yidC (yidC), hypothetical protein, glutamine-pyruvate aminotransferase
gltB (gltB), glutamate synthase small subunit gltS (gltS), undecaprenol kinase
udk (udk) hypothetical protein, NADH dehydrogenase, hypothetical
protein; zm12orf5, hypothetical protein, aspartate aminotransferase
A, beta-hydroxysteroid dehydrogenase, phosphomannomutase pmm



ORF Name 


NT ID AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


Z Z O UO 1 ~> X X .7 


1 11243 1 3153 


165 


498 222 


2 . 6e 


1 


Protein name 








Locus Name 




ACC# 










sp:TOLR_PSEAE 


P50599 


Description 














TOLR PROTEIN


ORF Name 


NT ID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


2 3444426_r3_4 5 


|1244 | (3154 


|419 


|1260 | |642 | 


[8T3e 


-94 J 


Protein name 








Locus Name 




Acc# 


ATP-dependent helicase HrpA homolog.






gp:D90779 


D90779:D90761:AB001340


Description 












E.coli genomic DNA, Kohara clone #268(31.6-32.0 min.).




i 


ORF Name


NT ID AAID 


NT 
Length 


AA 

T — i i Score 
Length 


Probability 


124021016 fc3 4J 
1 - - 


| |124b | |3165 | 


I 1195 1 


3588 | 2225 


|1.6e 


-266 | 


Protein name 








Locus Name 




Acc# 










sp:MFD_HAEIN 




P45128 


Description 














TRANSCRIPTION-REPAIR COUPLING FACTOR (TRCF)








i 


ORF Name 


NTID AAID 


NT 
Length 


— , Score 
Length 


Probability 


|24401887_cl_62 


| (1246 | |3166 | 


|114 | 


|34b | |Ub | 


|2 . 9e- 


i 


Protein name 








Locus Name 




Acc# 








gp:AB030825 




AB030825 


Description 














Pseudomonas aeruginosa genomic DNA, partial sequence, strain: PAO1.


1 


ORF Name 


NTID AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


|25554561_12_20 


| |1247 | |3167 | 


I 151 


456 |94 | 


[0.0015 | 


Protein name 








Locus Name 




Acc# 


hypothetical protein phiuui




pir :D71092 




D71092 



Description 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


12994032 c2 82 


" JT248 


| 3158 


254 


795 


I 307 1 


|9.5e-35 


Protein name 








Locus Name 


Acc# 


minor tail protein 


gpis 






pir:T13105 


T13105 


Description 














ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|319iS532_li_8 


| |1249 


| 3169 


1 F 


I 189 






Protein name 








Locus Name 


Acc# 


Description 














NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|3320327_c2_76 


| (1250 


| |3170 


1 y4 1 








Protein name 








Locus Name 


Acc# 


Description 














NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


34188892_ci_Sl 


1251 


| 3171 


573 


2022 ] 


227 


4.5e-15 


Protein name 








Locus Name 


Acc# 



Description 



sp:VG25_BPMD2 



O64220



MINOR TAIL PROTEIN GP25












ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


34415711_tl_ 


.10 1252 


| |3172 


1 F y 1 


|1107 | 


|288 | 


|2.7e-25 | 



Protein name 



Locus Name 



conserved hypothetical integral membrane 
protein HP1486 



pir:F64705



Acc# 
F64705 



Description 






ORF Name 



NTID AAID 



1 3536 7 058 fi 51 



IT75T 



][ 



TT7T 



NT AA 

— — Score 

Length Length 

7T 



TIT 



Probability 
8 . 8e-ll 



Protein name 



Locus Name 



sp:YDCQ_ECOLI 



Acc# 
P76107 



Description 

HYPOTHETICAL 16.1 KD PROTEIN IN TEHB-ANSP INTERGENIC REGION



ORF Name 



135942905 ±2 19 



^ NTID 
■j [1254 



AAID 



\JTTT 



NT AA 
Length Length 
P35 1 1471 



Score Probability 
12 78 | |3.ie-24 ~ 



Protein name 
Description 

HYPOTHETICAL TRNA/RRNA METHYLTRANSFERASE YIBK,



Locus Name 



sp:YIBK_ECOLI 



Acc# 
P33899 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


36118750_c2_104 


1255 


i p»« 




234 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|36328956_t2_23 


| |1256 


1 F 7t 


i i iob i 


pav | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|3944450_c2_93 


| |1257 


| 3177 


i ^ i 


|699 | |422 | 


|1.7e-39 



Protein name 



Locus Name 



TolQ protein 



gp:PPPAL1



ACC# 
X74218 



Description 

Pseudomonas putida ruvB, tolQ, tolR, tolA, tolB and oprL genes. 






ORF Name 
|4460587_i: rjST 



NT ID AAID 



JT7W 



NT AA 
Length Length 
513 1 11545 



Score 



Protein name 



Locus Name 



hypothetical protein jhp1382



pir:A71816



Probability 
|3.4e-17 



ACC# 
A71816 



Description 
ORF Name 



14507703 52 103 



1 



NTID AAID 

i 



AA 

— Score 
Length Length 



TZ5T 



\TT7T 



NT 

] EE 



] [ 



TOT" 



T5T 



Protein name 



Description 



Locus Name 



Probability 

p.0e-ll 

Acc# 



sp:Y014_BPHP1



P51716 



HYPOTHETICAL 14.9 KD PROTEIN IN REP-HOL INTERGENIC REGION (ORF14)



ORF Name 



14728415 c3 120 



Protein name 



NTID 



AAID 



][ 



NT AA 
Length Length 



: el 



Score Probability 
11 03 | [0.030 ~ 



Locus Name 



ras interacting protein RIPA



gp:AF159241



Acc# 
AF159241 



Description 



Dictyostelium discoideum ras interacting protein RIPA (ripA) mRNA, complete
cds. 



ORF Name 



NTID 



AAID 



14730050 cl 73 



[1251 | 



7TET 



NT AA score 

Length Length 



Probability 
| [1320 | [375 | |1.6e-34 



Protein name 



Locus Name 



TolB 



gp:HIU32470 



Acc# 
U32470 



Description 



Haemophilus influenzae tolQRAB gene cluster, inner membrane protein (tolQ)
gene, partial cds, inner membrane protein (tolR) , outermembrane integrity 
protein (tolA) and colicin tolerance protein (tolB) genes, complete cds. 



ORF Name 



NTID 



AAID 



5282805 Cl 63 



TZZT 



JTZT 



NT AA 
Length Length 
2T7 — 



Score 



[684 | |325 | [ 



Probability 
1 . 2e-29 " 



Protein name 



Locus Name 



minor tail protein L homolog: protein gp18



pir:T13104 



Acc# 
T13104 



Description 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|5348393_c2_83 




3183 


79 


240 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|682777_c2_79 


| |1254 


|3184 


|139 


p> | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|7255950_£iJ7 


| put 


| |3185 


1014 


|3045 | |725 | 


3.1e-134 


Protein name 








Locus Name 


ACC# 



sp:HRPA_ECOLI 



Description 






P43329:P77479:P76861:P76863


ATP-DEPENDENT HELICASE HRPA






ORF Name 


NT 

NTID AAID , — . , 
Length 


AA 

, — . , Score 
Length 


Probability 


|24113927_t2__i 


1255 3185 334 


|1005 115 


0.0012 


Protein name 




Locus Name 


Acc# 


STARP antigen




gp:PRSTARPA


Z30339 


Description 


P.reichenowi STARP gene for STARP antigen.




i 


ORF Name 


NT 

NTID AAID „ — , 
Length 


AA 

— Score 
Length 


Probability 


|25573905_±1_1 


|1257 |3187 | |205 | 


pi | UJb I 


7.0e-4i 



Protein name 

Description 
INTERGENIC REGION 



Locus Name 



|sp:VVCP_BACSU 



Acc# 
P37478 






ORF Name 



NT ID 



AAID 



NT AA 
— , — , Score 
Length Length 



Probability 



29927207 E3 4 



2.4e-05 



Protein name 



Locus Name 



probable two component sensor protein 



pir:C70624



Acc# 
C70624 



Description 



ORF Name 



13521030b ti 2 



NTID 
■] [1269 



AAID 



AA 

— , Score 
Length Length 



[3T5"9~ 



NT 
n 
3W 



Probability 



903 | |155 | |2.3e-08 



Protein name 



Locus Name 



SmeS 



gp:AF173226



Acc# 
AF173226 



Description 



Stenotrophomonas maltophilia multidrug efflux system SmeR (smeR), SmeS
(smeS), SmeA (smeA), SmeB (smeB), and SmeC (smeC) genes, complete cds.



ORF Name 



NTID 



AAID 



12938586 c3 89 



11270 



3TW 



NT AA 

— , — , Score 

Length Length 

T5T 



VST 



Probability 
3.8e-26 



Protein name 



Description 



Locus Name 



sp:PAL_PSEPU



Acc# 
P43036 



PEPTIDOGLYCAN-ASSOCIATED LIPOPROTEIN PRECURSOR



ORF Name 



NTID 



AAID 



NT 



AA 



— — Score Probability 



Length Length 



14237555 cl 39 



][ 



TTTT 



TT5T 



S3" 



] [ 



1 



Protein name 
Description 



Locus Name 



ACC# 



NO-HIT



ORF Name 



NTID AAID 



— — Score Probability 



14492157 t3 31 



\TT7T 



][ 



TTTT 



NT AA 
Length Length 
T75 1 [S"3T 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT 






ORF Name 



NTID AAID 



AA 

— Score 
Length Length 



14875327 F2 18 



TUT 



TT5T 



NT 
n 



T5T74" 



1 1592 J 



Probability 
1.7e-163 



Protein name 



Locus Name 



membrane alanyl aminopeptidase



gp:AF157493



Acc# 
AF157493 



Description 



Zymomonas mobilis ZM4 fosmid clone 42D7, complete sequence.




ORF Name 


NTID AAID 


NT AA 
— — Score 
Length Length 


Probability 


|156258_c3_90 


|1274 | |3194 


| |199 |500 | |125 | 


|2.6e-06 


Protein name 




Locus Name 


Acc# 


NrpG 




|gp:PMU46488 


U46488 



Description 



Proteus mirabilis NrpS (nrpS) gene, partial cds, NrpU (nrpU), NrpT (nrpT),
NrpA (nrpA), NrpB (nrpB), NrpG (nrpG) and IrpP (irpP) genes, complete cds.



ORF Name 
|15180387_i3_35 



NTID AAID 
] [1275 



NT 
Length 
|384 



AA 

, — L1 Score Probability 
Length 



Protein name 



[1155 | |143 | |3.8e-07 ~ 

Locus Name Acc# 



hypothetical protein RP367



pir:H71693



H71693 



Description 



ORF Name 



15507575 c2 54 



NTID 
j [1275 



AAID 



NT 
Length 

1 



AA 
Length 
11017 



Score Probability 
1445 | |5.1e-42 — 



Protein name 



Description 



Locus Name 



sp:SMTA_ECOLI 



Acc# 

P36566:P77586



SMTA PROTEIN


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


22116325_r2_14 


| |1277 


|3197 


i 105 i 


|330 | |20S | 


|1.7e-15 | 


Protein name 








Locus Name 


Acc# 



sp:PA1_KLEPN



P37446 



Description 

ACYLHYDROLASE) (OUTER MEMBRANE PHOSPHOLIPASE A) (OM PLA)






ORF Name 



NT ID AAID 



22634055 ti 11 



3198 



NT AA 
— — Score 
Length Length 

11865 



Probability 
12 .le-192 



Protein name 



Description 



Locus Name 



sp:CLPB_HAEIN 



Acc# 
P44403 



CLPB PROTEIN 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


23495633J:2JL5 


|1279 


j 3199 


i 345 i 


1050 | 


i 155 1 


(0.0072 | 



Protein name 



Locus Name 



ComB 



gp:AF027189



Acc# 
AF027189 



Description 

Acinetobacter sp. BD413 lytB, comB, comC, comE, and comF genes, complete cds;
and unknown genes.



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


2376890_c2_56 


|1280 


3200 


90 


p" i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


24395640_i:3_38 


|1281 


3201 


| 282 


|849 | |291 | 


|1.3e-25 


Protein name 








Locus Name 


Acc# 


ABC transporter potG 






pir:B71694 


B71694 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


24S4383i_ti_3 


|1282 


| |3202 


346 | 


1041 213 | 


3.2e-15 


Protein name 








Locus Name 


Acc# 



gp:CCPLDA



Y11031 



Description 
C.coli pldA gene.






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length - ■ 


Probability 


124783453 t2 25 




|3203 


224 | 


1575 1 1738 1 


|5.5e-73 


Protein name 








Locus Name 


Acc# 










sp:CLPB_BACNO 


| P17422 


Description 












CLPB PROTEIN 1 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|25817157_t3_34 


1284 


|3204 | 


250 


|753 | 319 | 


|1.4e-28 


Protein name 








Locus Name 


Acc# 


hypothetical protein






gp:AHWAAA179


1 Z96927 


Description 


Acinetobacter haemolyticus waaA gene, strain ATCC 17906.




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|2995252_13_37 


1285 


3205 | 


342 | 


1029 |198 


1.4e-13 


Protein name 








Locus Name 


Acc# 


CT391 hypothetical protein






] pir:G72072 


| G72072 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


32703126_t2_22 


| |1285 


| 3205 


304 


|915 | |359 | 


|5.9e-34 | 


Protein name 








Locus Name 


Acc# 


hypothetical protein RP368






pir:A71694


A71694 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|3536894IJ:i_4 


J1287 


1 3207 1 


285 | 


|858 | 152 


|4.5e-09 | 


Protein name 








Locus Name 


Acc# 


competence protein ComF 






gp:PST249742


AJ249742 



Description 



Pseudomonas stutzeri JM300 bioB (partial), comF and dot (partial) genes.






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 


Probability 


|35943885_c2_6S 


| 1288 


[3208 


413 


|1242 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT




ORF Name


NT ID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability


|4111633J:2_13 


1 11289 
1 1 


1 13209 1 
1 1 1 


1154 1 


1465 1 
■ ' 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|4i42i8S_£i_2 


| |1290 


|3210 | 


246 


|741 | |777 | 


|4.0e-77 | 


Protein name 








Locus Name 


Acc#










sp:RNPH_PSEAE


P50597 


Description 












NUCLEOTIDYLTRANSFERASE) 




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


|4572203_ti_9 


1291 


|3211 


329 


990 117 


5.7e-05 


Protein name 








Locus Name 


Acc# 


merozoite surface antigen 2




gp:U91655


U91655 


Description 












Plasmodium falciparum isolate V310, merozoite surface antigen 2 (MSP-2) gene,
partial cds.




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|4797282_c2_74 


| |1292 


| |3212 


67 


204 




Protein name 








Locus Name 


Acc# 



Description 
NO-HIT






ORF Name 



15082637 t3 33 



Protein name 



NTID 



AAID 



TZTT 



NT AA 

— — Score 

Length Length 

WZQ 1 fTTZT 



] i 



or 



Probability 
|4.5e-S2 



Locus Name 



WaaA 



gp:AF026386 



Acc# 
AF026386 



Description 



Salmonella typhimurium strain LT2 LPS core oligosaccharide biosynthesis
region, WaaY (waaY) gene, partial cds; WaaJ (waaJ), WaaI (waaI), WaaB (waaB),
WaaP (waaP), WaaG (waaG), and WaaQ (waaQ) genes, complete cds; and WaaA (waaA)
gene, partial cds.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


513283b_tl_5 


|1294 | 


|32i4 


if 55 i 


p" i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


5867075_13_29 




|3215 


i r' 2 i 


|609 | |105 | 


10.00049 


Protein name 








Locus Name 


Acc# 


pilv protein 


pir:S77594


S77594 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

„ — L , Score 
Length 


Probability 


790700_c2_55 




|32I6 


i r i 


1140 | |151 | 


|9.2e-08 


Protein name 








Locus Name 


Acc# 


hypothetical 


protein TP0565 






pir:C71308 


C71308 



Description 
ORF Name 



9775283 cl 46 



NTID 
] [1297 



AAID 



[3TTT 



NT 
Length 
|499 



AA 

_ — Score 
Length 



11500 



|459 | 



Protein name 



Locus Name 



probable alginate O-acetylation protein 
(algI)



pir :D71308 



Probability 
|1.8e-44 

Acc# 
D71308 



Description 






ORF Name 


NTID AAID 


NT . 
Length 


, — . , Score 
Length 


Probability 


115827 cl 7 


1298 3218 


330 | 


1993 1 1877 1 


|l . oe-87 


Protein name 






Locus Name 


Acc# 








sp:GLMU_HAEIN


P43889 


Description 










ACETYLGLUCOSAMINE-1-PHOSPHATE URIDYLTRANSFERASE)




ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|15828178 £1 2 | 


|1299 | [3219 


1 I 6 " 1 


J1851 | J2166 | 


|2.6e-224 | 


Protein name 






Locus Name 


Acc# 








sp:TYPA_HAEIN


P44910 


Description 










GTP-BINDING PROTEIN TYPA/BIPA HOMOLOG






ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|326564b'5_c2_10 | 


J1300 | (3220 


i vi) i 


TTJ 1 




Protein name 






Locus Name 


ACC# 


Description 










NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


, — . , Score 
Length 


Probability 


3336053_rl_l | 


(1301 | (3221 




|402 | |618 | 


|2.9e-50 | 


Protein name 






Locus Name 


ACC# 


outer membrane protein CD precursor 


pir:S39866


S39866 


Description 










ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|10975831_c3_12 | 


|1302 | (3222 


ip i 


|279 | 




Protein name 






Locus Name 


Acc# 



Description 
NO-HIT






ORF Name 


NTID AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


15912757_ci_8 


1303 3223 


123 


3 


72 86 


0.048 


Protein name 








Locus Name 


Acc# 


FIP2 


gp:AF061034 


| AF061034 


Description 


Homo sapiens FIP2 alternatively translated mRNA, complete cds


1 


ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability


22457187_i3_5 


| 1304 | 3224 


1 P° 1 


P 


53 | |888 


v.oe-ay 1 
l 


Protein name 








Locus Name 


ACC# 










sp:Y882_HAEIN


P44068 


Description 












HYPOTHETICAL PROTEIN HI0882








1 


ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . _ Score 
Length 


Probability 


35271883_il_4 


1305 3225 


i r i 


183 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


22848457_t2_3 


| J1306 | |3226 


i p 4 i 


|40b 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


228b3376_t3_i> 


| 1307 | 3227 


i 259 i 


|720 | 




Protein name 








Locus Name 


Acc# 



Description 
NO-HIT 






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


29296968_c3_9 


| |1308 


| 322a 


77 


p 3 4 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT | 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


97667B_cJ_iO 


| |1309 


| |3229 


1 P° 1 


|570 | J265 


|7.3e-23 | 


Protein name 








Locus Name 


Acc# 



sp:PRTR_PSEAE



Q06553 



Description 

TRANSCRIPTION REGULATORY PROTEIN PRTR (PYOCIN REPRESSOR PROTEIN)



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


98907V_tl_l 


| |1310 


| |3230 


1 1 


|369 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|1062b7b_c3_34 


1 1" 11 


| |3231 


109 | 


330 | 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


iii25280_±3_i3 


| |13i2 


| J3232 


1 l i4 * 1 


|44i | 538 | 


|8.5e-52 | 


Protein name 








Locus Name 


Acc# 



nifU protein homolog HI0377



pir:C64064



C64064 



Description 






ORF Name 
|1297092_cl_20 



NT ID AAID 



TttT 



TTTT 



NT 
Length 
TTI 



AA 
Length 
|335 



Score 



[T7T 



Protein name 



Locus Name 



probable gamma-glutamyl transpeptidase precursor



pir :E70682 



Probability 
3.9e-12

Acc# 
E70682 



Description 

ORF Name 
|I5S92919_c3_3b 
Protein name 



NT ID 



AAID 



][ 



TUT 



NT 
Length 

] 



AA 

_ — _ Score 
Length 



I EZO ED [ 



Probability 
|2.4e-17 



Locus Name 



probable gamma-glutamyl transpeptidase



pir :T34901 



Acc# 
T34901 



Description 

ORF Name 
|2072 7194_c2_24 
Protein name 

Description 



NT ID AAID 



TJTT 



TUT 



NT 
Length 
ST 



AA 

T — ^ Score 
Length 



TTST 



Probability 
|1.8e-28 



Locus Name 



gp:AF017750



ACC# 
AF017750 



Haemophilus ducreyi cytochrome C-type biogenesis protein
(ccmH), recombinational DNA repair protein (recR), manganese
superoxide dismutase (sodA), and CitG protein homolog (citG) genes,
complete cds.



ORF Name 
|2i67902i>_fTH~ 



NTID 



AAID 



TUT 



NT AA 
— — Score 

Length Length 



] eed 



Probability 
|2.0e-155 



Protein name 



Description 



Locus Name 



sp:NIFS_ECOLI



ACC# 

P39171:P76581:P76992



NIFS PROTEIN HOMOLOG


ORF Name NTID 


AAID 


NT 
Length 


AA 

t — n Score 
Length 


Probability 


pi797152_r3_14 | 1317 


| |3237 


185 | 


224 


1.6e-18 | 



Protein name 

Description 
CHAPERONE PROTEIN HSCB (HSC20) 



Locus Name 



sp:HSCB_ECOLI 



Acc# 
P36540 






ORF Name 
|32244203_c2_26 



NTID AAID 



NT AA 
— — Score 
Length Length 



|131fl | [3238 | |72 | 



ITT 



Probability 
5.0e-08 



Protein name 



Description 



Locus Name 



gp:VCH231122



Acc# 
AJ231122 



Vibrio cholerae z61t gene.










ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|33398287_t2_5 


|1319 


3239 


1 17S 1 


|537 | 395 | 


|9.6e-37 


Protein name 








Locus Name 


Acc# 



Description 



sp:YFHP_HAEIN 



P44675 



HYPOTHETICAL PROTEIN HI0379














ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|36129678J:i_2 


1320 3240 


|112 


339 384 


1.8e 


1 


Protein name 








Locus Name 




Acc# 










sp:YFHE_HAEIN




| P44672 


Description 














HYPOTHETICAL PROTEIN HI0376














ORF Name 


NTID AAID 


NT 
Length 


AA 

- — . , Score 
Length 


Probability 


|36220382_c2_25 


| 1321 |324i 


IP 5 


|363 | 174 


|3.2e 


I 


Protein name 








Locus Name 




Acc# 










sp:GGT_PIG 




j P20735 


Description 














GLUTAMYLTRANSFERASE) (GGT)




ORF Name 


NTID AAID 


NT 
Length 


AA 

t — , i Score 
Length 


Probability 


|4331938_t2_9 


| 1322 3242 


522 | 


1869 1435 1 


7.6e 


1 


Protein name 








Locus Name 




Acc# 










sp:HSCA_HAEIN 


P44669 



Description 
CHAPERONE PROTEIN HSCA (HSC66)






ORF Name 



14332838 Tl 10 



NT ID 



11323 



AAID 



TZZT 



NT AA 
Length Length 
TTB — 



Score 



Probability 
7.5e-37 



Protein name 



Locus Name 



ferredoxin



gp:AF096864



Acc# 
AF096864 



Description 



Pseudomonas aeruginosa heat shock protein (hscB), heat shock protein 66-KDa
(hscA), ferredoxin (fdx), and nucleoside diphosphate kinase (ndk) genes,
complete cds.



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

T — Score 
Length 


Probability 


|5898b93_c2_28 


| |1324 


| | 32 44 


1 I 119 1 


360 






Protein name 








Locus Name 


Acc# 


Description 














NO-HIT


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


7070215_c2_27 


i 


| 3245 


161 


I 486 1 


351 


5.6e-32 



Protein name 



Locus Name 



putative gamma-glutamyl transpeptidase precursor



gp:PST249741



Acc# 
AJ249741 



Description 



Pseudomonas stutzeri JM300 gacS (partial) and ggtB (partial) genes.


ORF Name 


NT ID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|129i5808J:3_10 


|1325 | 3245 


200 


P 3 1 




Protein name 






Locus Name 


Acc# 


Description 










NO-HIT


ORF Name 


NT ID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


20737503_t3J} 


| |1327 | [3247 


| 371 


1115 [418 | 


|4.5e-39 



Protein name 



Locus Name 



probable permease perM homolog (perM) RP630

pir:E71668



Acc# 
E71668 



Description 






ORF Name 



NTID 



AAID 



122000293 c2 13 



NT AA 
— — Score 

Length Length 

?7 1 |2^r 



Probability 
1.2e-31



Protein name 



Locus Name 



50S ribosomal protein homolog



gp:AF153712



Acc# 
AF153712 



Description 



Pseudomonas sp. BG33R strain BG33R 50S ribosomal protein homolog gene,
complete cds.



ORF Name 



123863307 ±3 9 



Protein name 



Description 



NTID AAID 



TTIT 



NT AA 
— — Score 

Length Length 



Probability 
|2.4e-i5 — 



Locus Name 



sp:VPGE_HAEIM 



Acc# 
O86235



HYPOTHETICAL PROTEIN HI1225.1 


ORF Name NTID AAID 


NT 
Length 


AA 

. — , , Score 
Length 


Probability 


|24308561_c3_17 | |1330 | |3250 


i p 


|549 j |710 | 


|5.ie-70 | 


Protein name 




Locus Name 


ACC# 


phosphoribosylformylglycinamidine cyclo-ligase: 5'-phosphoribosyl-5-aminoimidazole synthetase


pir rAJECPC 




026 


Description 








ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|26277251_c3_18 |133i | |3251 


131 | 


I" 6 1 H M 1 


|4.4e-32 


Protein name 




Locus Name 


ACC# 



sp:PUR5_ECOLI



P08178 



Description 

(PHOSPHORIBOSYL-AMINOIMIDAZOLE SYNTHETASE) (AIR SYNTHASE)



ORF Name 



NTID 



AAID 



6142515 c2 14 



TUT 



TZ5T 



NT AA 
Length Length 
TT5 — 



Score 



Probability 
2.9e-35 



Protein name 



Locus Name 



5'-phosphoribosylglycinamide transformylase



gp:STU68765



Acc# 
U68765 



Description 



Salmonella typhimurium 5'-phosphoribosylglycinamide transformylase (purN) and
5'-phosphoribosyl-5-aminoimidazole synthetase (purI) genes, complete cds.






ORF Name 



NTID 



AAID 



10744000 c3 102 



TTJT 



Protein name 



probable Mn transport protein 



Description 



NT 
n 



AA 

— Score 
Length Length 



Probability 
1094 | |1.0e-110 



Locus Name 



pir :G64063 



Acc# 

G64063:C41833



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

Length ■ 


Probability 


1181S3i_tl_2 


|1334 


3254 


558 | 


|1677 | J1333 | 


4.9e 


-136 


Protein name 










Locus Name 




Acc# 












sp:60IM_PSEPU 


P25754 


Description 
















60 KD INNER-MEMBRANE PROTEIN


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


13703378_c3_117 


1335 


3255 | 


35 


|288 | 153 


4.7e 


-12 


Protein name 










Locus Name 




Acc# 












sp:YEAQ_ECOLI 


P76246 


Description 
















HYPOTHETICAL 8.7 KD PROTEIN IN GAPA-RND INTERGENIC REGION






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|15031513_t3_43 | 


[1336 


| 3255 | 


479 


1440 |1390 | 


|4.5e 


-142 


Protein name 










Locus Name 




Acc# 












sp:THRC_METGL


P37145 


Description 
















THREONINE SYNTHASE, 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


i5039077_ci_64 


|1337 


|3257 | 


|2SS | 


801 195 


|1.5e- 


-15 


Protein name 










Locus Name 




Acc# 












gp:DNINTREG


X98546 



Description 

D.nodosus intB, regA, gepA, gepB, and gepC genes.






ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . . Score 
Length 


Probability 


115555903 cl 59 


|1338 | |3258 


281 


846 |1074 | 


|1.4e 


-108 


Protein name 








Locus Name 




Acc# 










sp:Y360_HAEIN 


P44661 


Description 
















HYPOTHETICAL PROTEIN HI0360














ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


[1579882i>_clJ>b 


| |1339 | 3259 


1 165 1 


498 | 






Protein name 








Locus Name 




Acc# 


Description 
















NO-HIT




ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


19538327_cl_72 


| |i340 | 3250 


1 213 1 


|642 | 218 


|7.0e 


-18 


Protein name 








Locus Name 




Acc# 










sp:Y882_METJA 


Q58292 


Description 














HYPOTHETICAL PROTEIN MJ0882










i 


ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


197188_r3_42 


| |1341 | 3261 


1 345 1 


1038 | 659 


Im- 


-64 


Protein name 








Locus Name 




ACC# 










sp:FMT_PSEAE 


O85732


Description 














METHIONYL-TRNA FORMYLTRANSFERASE , 




ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability


20197175_r3_44 


| |1342 | 3262 


1 415 1 


1248 | |453 | 


|5.0e- 


-47 | 


Protein name 








Locus Name 




Acc# 










sp:SMF_HAEIN 


P43862 



Description 

SMF PROTEIN (DNA PROCESSING CHAIN A)






NT ID AAID 

i 



ORF Name 
p3440fl86_t2_27 
Protein name 



Description 
HYPOTHETICAL PROTEIN MJ0578



TZUT 



NT AA 
Length Length 
11 809 | 



Score 



[602 



2n~ 



Probability 
2.1e-19 



Locus Name 



sp:Y578_METJA 



Acc# 
Q58091 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|241290_t2_3i 


|1344 


|3264 


1 1" 1 


I 198 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


24244010_c3_I06 


| |1345 


| |3265 


p | 


|252 |69 | 


|0.042 


Protein name 








Locus Name 


Acc# 


hypothetical protein YlObCbB.x




pir :T26400 


T26400 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — , , Score 
Length 


Probability 


|24253i97_c3_I07 


i i im 


| |3266 


66 j 


poi | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|242E>b262_c2_96 


|1347 


|3267 


1 1 


|1293 | |B73 | 


2.7e-87 | 


Protein name 








Locus Name 


Acc# 


conserved hypothetical protein




pir:C75339 


C75339 



Description 






ORF Name 



NT ID AAID 



AA 

— Score 
Length Length 



24256550 t3 40 



][ 



3T51T 



NT 
in 
IFF 



] F^n 



Probability 
4.7e-44 



Protein name 



Locus Name 



sp:YBAD_ECOLI 



ACC# 
P25538 



Description 

HYPOTHETICAL 17.2 KD PROTEIN IN TSX-RIBG INTERGENIC REGION (ORF1)



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


p4337786_t3_4B 


1349 


j 3269 


i p 11 


|936 | 657 | 


2.1e-64 | 


Protein name 








Locus Name 


Acc# 










sp:ARGI_BRUAB 


Q59174 


Description 












ARGINASE,


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


24417752J:i_iB 


1350 


| 3270 


|62 


l"» 1 I 74 1 


|0.030 


Protein name 








Locus Name 


Acc# 



sp:FM1A_SERMA



P22595 



Description 
TYPE-1 FIMBRIAL PROTEIN SUBUNIT PRECURSOR



ORF Name 



NT ID AAID 



24489626 12 21 



T3"5T~ 



TZTT 



NT AA 
Length Length 

T73 



Score 



Protein name 



Description 



[1425 | [1058 | 
Locus Name 



Probability 
6.8e-107 



sp:THDF_PSEPU 



Acc# 
P25755 



POSSIBLE THIOPHENE AND FURAN OXIDATION PROTEIN THDF 


NT AA 

ORF Name NT ID AAID , — L1 „ — ^ 

Length Length 


Score 


Probability 


24643777_i2_22 | |1352 | |3272 352 | |1059 | 


682 


|4.3e-78 | 



Protein name 

Description 
RIBOFLAVIN-SPECIFIC DEAMINASE, 



Locus Name 



sp:RIBD_ECOLI 



Acc# 
P25539 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 




T353 


| |32 73 


1 1 | 


|13B3 | |695 | 


2.0e-68


Protein name 










Locus Name 




Acc# 












sp:SUN_HAEIN 


P44788 


Description 
















SUN PROTEIN (FMU PROTEIN)
















ORF Name 


NT ID 


AAID 


NT 
Length 


, — . , Score 
Length 


Probability 


|26603b62_c2_86 


| 1354 


| |3274 


| 303 | 


|912 | |1047 | 


9.9e-106


Protein name 










Locus Name 




Acc# 



sp:FECE_HAEIN 



P44662 



Description 

IRON(III) DICITRATE TRANSPORT ATP-BINDING PROTEIN FECE HOMOLOG



ORF Name 


NT ID AAID 


NT 
Length 


AA 

. — . . Score 
Length 


Probability 


|2738783_t2__37 


|1355 | [3275 


ir i 


183 




Protein name 






Locus Name 


Acc# 


Description 










NO-HIT


ORF Name 


NT ID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


2750262_ti_l 


1355 | |3276 


i r i 


|312 | |193 | 


|3.1e-15 | 



Protein name 



Locus Name 



hypothetical protein SCH24.04



pir:T36559 



Acc# 
T36569 



Description 



ORF Name 



29B390ib cl i>2 



NTID 
"| [1357 



AAID 



TT7T 



NT 
n 



AA 

— Score 
Length Length 



^5" 



Probability 
2.3e-65 



Protein name 



Locus Name 



sp:YDHH_ECOLI



ACC# 
P77570 



Description 

HYPOTHETICAL 39. b KD PROTEIN IN PDXH-SLYB INTERGENIC REGION






ORF Name 



1 30 7 39700 il 7 



Protein name 



NT ID 



AAID 



rrrnr 



NT AA 
Length Length 

]ee: 



— , Score 



Probability 
i.2e-20 



Locus Name 



sp:YRDC_ECOLI



ACC# 
P45748 



Description 

HYPOTHETICAL 20. a KD PROTEIN IN AROE-SMG INTERGENIC REGION



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length - 


Probability 


3408BI43_i3_55 


| |1359 


| |3279 


1 I 106 1 


i 321 i 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability- 


391321b_t2_26 


| |I360 


| 3280 


1 Ub 1 


|498 | 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


|3939063_t2_23 


|I36i 


[3281 


225 


|578 | 519 


8.8e-50


Protein name 








Locus Name 




Acc# 










|sp:RISA_PHOPO 




P51961 


Description 














RIBOFLAVIN SYNTHASE ALPHA CHAIN, 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


3942213_c2_97 


| 1362 


| 3282 


1 "V | 


1104 | |930 | 


2.5e-93


Protein name 








Locus Name 




ACC# 



Description 



sp:UCH2_PH0Lt! 



Q02008 






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


4785911 £2 29 


[TT63 


1 13283 


435 


1308 1321 


9.1e-135


Protein name 










Locus Name 




Acc# 












sp:OAT_DROAN




1 P49724 


Description 
















ACID AMINOTRANSFERASE ) 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


5214052J:3_!>3 




| P 84 | 


|411 | 


|1235 | |1096 | 


6.4e-111


Protein name 










Locus Name 




Acc# 












sp:SYY_HAEIN 




P43836 


Description 
















TYROSYL-TRNA SYNTHETASE, (TYROSINE--TRNA LIGASE) (TYRRS)




1 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


5282562_cl_63 


i \ tM 


| 3285 


i" 5 i 


|798 | 593 


3.2e-68


Protein name 










Locus Name 




Acc# 


hypothetical protein jnpui^u




pir:B71947 




B71947 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|6070i66J:2_20 


| |1366 


| |328£ | 


|V1 | 


1216 1 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


5147028_ci_60 


j |I3S7 


| 3287 | 


292 | 


|879 | |870 | 


5.6e-87


Protein name 










Locus Name 




ACC# 












sp:YFED_YERPE


Q56955 



Description 

CHELATED IRON TRANSPORT SYSTEM MEMBRANE PROTEIN YFED






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Probability 


|839752_tl_19 


| |1368 


3288 




I 183 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|867183_ci_68 


| (1369 


| 3289 


i 128 i 


|387 | |107 | 


|2.7e-05 


Protein name 








Locus Name 


ACC# 



sp:YRAM_BACSU



O07931



Description 

HYPOTHETICAL 39.5 KD PROTEIN IN SIGZ-CSN INTERGENIC REGION



ORF Name 



1197077 t3 44 



NTID 
"j [1370 



AAID 



NT 

in 



AA 

— Score 
Length Length 

|T7B~ 



TTT5" 



Probability 
|8.2e-li 



Protein name 



Locus Name 



hypothetical protein TM0342 



pir:D72388 



Acc# 
D72388 



Description 



ORF Name 



114641008 t3 46 



NTID 
] [1371 



AAID 



NT 
in 
E7T 



AA 

— Score 
Length Length 



] EO ED t 



Probability 
5.2e-33 



Protein name 



Locus Name 



putative thiol:disulfide interchange protein | gp:AF057031



ACC# 
AF057031 



Description 



Pseudomonas aeruginosa putative thiol:disulfide interchange protein precursor (dsbC) gene, complete cds.



ORF Name 
|15058126_£1~T 



NTID 



AAID 



NT AA 
— — Score 
Length Length 



TTTT 



TF3" 



Probability 
3.6e-14 



Protein name 



Locus Name 



hypothetical protein 



gp:AF088857



Acc# 
AF088857 



Description 



Vogesella indigofera indigoidine biosynthesis regulatory locus, complete sequence.






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 






1 132 93 


I 89 1 

II 1 


|270 | |350 | 


7.2e-32


Protein name 










Locus Name 




Acc# 












sp:IMDH_ACICA 


P31002 


Description 
















DEHYDROGENASE) (IMPDH) (IMPD)














ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


169y2775_t2_22 


| |1374 


| |3294 


1 1 61 i 


|I86 | |85 | 


0.00086


Protein name 










Locus Name 




Acc# 


gamma-carboxymuconolactone decarboxylase


pir:B69129




B69129 


Description 
















ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


20353465_r2_iii 


| |1375 


| J3295 


1 1 1 " i 


504 | 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT




ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — , , Score 
Length 


Probability 


|20734687_ca_7« 


1 l im 


| |3296 


1 P v 1 


|894 | 642 | 


|8.2e 


1 


Protein name 










Locus Name 




Acc# 












sp:YAAJ_HAEIN




P44555 


Description 
















HYPOTHETICAL PROTEIN HI0183 




ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|21572011_cl_b4 


| |1377 


| |3297 


ii" i 


|m | |u | 


(0.0095 | 


Protein name 










Locus Name 




Acc# 












sp:YV10_METJA




Q60309 



Description 
HYPOTHETICAL PROTEIN MJECS10






ORF Name 



NTID 



AAID 



][ 



NT AA 

— — Score 

Length Length 

T77 



Probability 
0.0035 



Protein name 



Locus Name 



ORF ivisvu^b hypothetical protein



AF063866 



Acc# 
AF063866 



Description 

Melanoplus sanguinipes entomopoxvirus, complete genome.



ORF Name 



NTID 



AAID 



NT AA 
— — Score 
Length Length 



12347156 tl 8 



Protein name 



[ 13 7 9 | [ 3299 | [1105 | [3318 | p?T 

Locus Name 



Probability 
2.0e-286



isoleucine--tRNA ligase, : isoleucyl-tRNA synthetase



bir:SYECTT 



Description 



ORF Name 



Acc# 

B64723:S40 
549:A94277 
:A91325 :A9 



NTID 



AAID 



23652183 cl 56 



NT 
n 
777 



AA 

— Score 
Length Length 



T5S5" 



Probability 
uTu 



Protein name 



Locus Name 



outer membrane protein CopB 



gp:U69981 



Acc# 
U69981 



Description 



Moraxella catarrhalis strain O12E outer membrane protein CopB gene, complete cds.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


|23865650_c2_77 


(1381 


| |3301 


1 I s7 1 


P 64 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


25506316_tl_14 


1382 


| |3302 


228 


|687 | 554 


|1.7e-53 


Protein name 








Locus Name 


Acc# 



sp:YIHA_ECOLI



Description 

HYPOTHETICAL GTP-BINDING PROTEIN IN POLA-HEMN INTERGENIC REGION



P24253 :P76 
771 






ORF Name 
|2S84717_tTTT 



NTID AAID 



TJXT 



JJUT 



NT 
Length 
Wl 



AA 

t — ^ Score 
Length 



TTT 



Probability 
|3.1e-08 



Protein name 



Locus Name 



gamma-carboxymuconolactone decarboxylase



pir:B69129 



Acc# 
B69129 



Description 



ORF Name 



NTID AAID 



NT 
Length 



| 2S942i37_t2_29 | [1384 | [3304 | [ 



185 



Protein name 



Description 



AA 

, — Ll Score Probability 
Length 

|558 | |295 | p.Be-26 

Locus Name Acc# 

P21863 



sp:FKBX_PSEFL 



(EC 5.2.1.8) (PPIASE) (ROTAMASE)


ORF Name NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|26364431_ci_49 1385 3305 


117 


I 354 1 


poo | 


|i.4e-26 | 



Protein name 



Description 



Locus Name 



binFEKRV 



Acc# 

S72167 :S78 
121:A00210 



ORF Name 



132056506 c3 81 



Protein name 



Description 



NTID AAID 



NT AA 
— — Score 
Length Length 



Probability 



3306 | |401 | [1206 | [i486 | |3.0e-152 



Locus Name 



sp:IMDH_ACICA 



Acc# 
P31002 



DEHYDROGENASE) (IMPDH) (IMPD)






i 


ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


32477250_Cl_65 


1387 3307 


443 


1332 1422 


I.ae-i4b 


Protein name 






Locus Name 


Acc# 



sp:YCDG_ECOLI



P75892 



Description 

HYPOTHETICAL 48.1 KD PROTEIN IN WRBA-PUTA INTERGENIC REGION






ORF Name 



NTID AAID 



AA 

— Score 
Length Length 



134655952 £2 28 



11388 



NT 

i? 
T77 



Probability 



|i>34 | [573— | | 4.3e-34 



Protein name 



Description 



Locus Name 
sp:LSPA_PSEFL



Acc# 
P17942 



PEPTIDASE) (SIGNAL PEPTIDASE II) (SPASE II)






ORF Name NTID AAID 


NT 
Length 


AA 

„ — ^, Score 
Length 


Probability 


4775V62_tl_15 | |1389 | |3309 


| 252 


| |759 | |593 


1 M^ 1 ' 1 


Protein name 




Locus Name 


Acc# 






sp:YRAL_ECOLI | P45528


Description 








HYPOTHETICAL 31.3 KD PROTEIN IN AGAI-MTR INTERGENIC REGION 


(P28S) 


ORF Name NTID AAID 


NT 
Length 


AA 

_ — Score 
Length 


Probability 


|5910313J:2_30 1390 | 3310 


392 


1179 895 


| |1.3e-89 


Protein name 




Locus Name 


Acc# 


homoserine O-acetyltransferase




gp:LMMETYX


| Y10744 


Description 








L.meyeri metY and metX genes.








ORF Name NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


597S592_12_4i | |139i | 3311 


i i ib2 


|459 | |276 


| |5.0e-24 j 


Protein name 




Locus Name 


ACC# 


Lportx 




Jgp:LPU63641 


U63641 


Description 








Legionella pneumophila rpoD operon Lportx, LpdnaG, and LprpoD genes, complete cds.




ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


818765_ti_7 |1392 | |3312 


i «• i 


|297 | 




Protein name 




Locus Name 


ACC# 



Description 
NO-HIT 






ORF Name 



|976b«32 Tl 38 



Protein name 



NT ID 



AAID 



TTTT 



JTTT 



NT AA 
— , — , Score 
Length Length 

13 77 I [TTTTT 



Locus Name 



homoserine dehydrogenase



gp:L78665



Probability 

|1.5e-IlI 

Acc# 
I L78665 



Description 



Methylobacillus flagellatum aspartate aminotransferase (aat), membrane protein (orf-1), homoserine dehydrogenase (hom), and threonine synthase (thrC) thymidylate synthase (thyA) genes, complete cds.



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Probability 




9773436_jt2_31 


1 1394 


| 3314 


| 215 


P y 1 I 117 1 


0.00011


Protein name 








Locus Name 


Acc# 


probable 24-sterol C-methyltransferase,


pir :T03845 


T03845 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


|10423125_c2_44 


| |1395 


| pis 


124 


|375 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|1069202_t2_li 


| |139S 


|pu 


i p i 


i 196 i 




Protein name 








Locus Name 


Acc# 


Description 












|NO-HIT 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


12933427_t2_5 


1397 


3317 


| 131 


|396 | |239 


4.1e-20


Protein name 








Locus Name 


Acc# 



sp:DHSC_ECOLI



P10446 



Description 

SUCCINATE DEHYDROGENASE CYTOCHROME B-556 SUBUNIT






ORF Name 



NT ID AAID 



120330461 Tl 19 



Trnr 



NT AA 
Length Length 
TT9 



690 



Score Probability 
17 14 | |1.9e-70 — 



Protein name 



Locus Name 



fumarate reductase flavoprotein subunit



gp:AB015757



Acc# 
AB015757 



Description 



Rhodoferax fermentans genes for fumarate reductase subunits, complete cds.



ORF Name 



1214128 Tl 9 



NTID 
[TUTS" 



AAID 



] [1399 | [3319 | 



NT AA 

— , — , Score 

Length Length 

] [2292 | 



757 



Probability 
1.1e-266



Protein name 



Description 



Locus Name 



sp:ODO1_AZOVI



Acc# 
P20707 



KETOGLUTARATE DEHYDROGENASE)



ORF Name 



121501557 t3 27 



Protein name 
Description 



NTID 



AAID 



TTtfTT 



NT AA 
Length Length 

] f i F^n 



— . , Score Probability 



Locus Name 



Acc# 



[NO-HIT 



ORF Name 



121510931 r2 b 



NTID 
] [1401 



AAID 



Score 



Probability 



][ 



NT AA 
Length Length 
[381 | [1145 | |144b | |5.6e-148 



Protein name 



Locus Name 



fumarate reductase flavoprotein subunit



gp:AB015757 



Acc# 
AB015757 



Description 



Rhodoferax fermentans genes for fumarate reductase subunits, complete cds.



ORF Name 



123469010 ta 2b 



Protein name 
Description 



NTID AAID 



TTZT 



NT AA 
Length Length 

] f i f*~i 



Score Probability 



Locus Name 



Acc# 



NO-HIT






ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


Z J O J J UD / V_j ZD J 1 J J J -J 


|183 


1552 1 176 1 

|552 | |76 | 


|0 . 018 1 


Protein name 




Locus Name 


ACC# 


putative adhesin MAA1




gp:AF154922 


! AF154922 


Description 








Mycoplasma arthritidis strain 158 putative adhesin MAA1 (maa1) gene, complete cds.










ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|242414S3_r2_10 1 11404 1 13324 1 


58 


1 1207 1 131 1 


|2.5e-07 1 
1 1 


protein name 




Locus Name 


ACC# 






sp:ODO1_HAEIN




Description 








KETOGLUTARATE DEHYDROGENASE) 








ORF Name NTID AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


24251441J:i_4 11405 1 3325 


375 


11128 1 123 


|5.Se-05 | 


Protein name 




Locus Name 


Acc# 


heme receptor 




1 gp:VIBHUTA 


L27149 


Description 








Vibrio cholerae heme receptor (hutA) gene, complete cds.






NT 
Length 


AA 

, — . , Score 
Length 


Probability 


244272<52_t2_8 1406 | [3326 


123 


| |372 234 


|2.6e-18 | 


Protein name 




Locus Name 


ACC# 


alpha-ketoglutarate dehydrogenase




gp:AF068740 


| AF068740 


Description 








Pseudomonas putida dihydrolipoamide succinyltransferase (kgdB) and alpha-ketoglutarate dehydrogenase (kgdA) genes, complete cds.




ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


263S7i92_ti_2 |1407 | 3327 | 


455 


|1467 | |140i| 


|3.0e-143 | 


Protein name 




Locus Name 


Acc# 


dihydrolipoamide dehydrogenase 




gp : PSELPDA 


M28356 


Description 









P. fluorescens dihydrolipoamide dehydrogenase (lpd) gene, complete cds.






1 



ORF Name NT ID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|25377042_r2_7 | 1408 | (3328 


| uu | 


|579 | |790 | 


1.7e-78


Protein name 




Locus Name 


ACC# 


succinate dehydrogenase putative iron 


gp:SPSDH 


Y13760 


sulphur 














Description 








Shewanella frigidimarina NCIMB400 sdhA, sdhB, sdhC, sdhD and sucA genes.


ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|3I43937SJ:i_l | |1409 | 3329 


1 ^ 1 


405 275 " 


16 3e-24 


Protein name 




Locus Name 


ACC# 






sp:DHSD_ECOLI 


P10445 


Description 








SUCCINATE DEHYDROGENASE HYDROPHOBIC MEMBRANE ANCHOR PROTEIN




ORF Name NTID AAID 


NT 
Length 


— Score 
Length 


Probability 


|4064425_r3_26 | |1410 | 3330 


i w » i 


11800 1 1123 1 


18 le-05 


Protein name 




Locus Name 


Acc# 






sp:FOXA_SALTY


Q56145 


Description 








FERRIOXAMINE B RECEPTOR PRECURSOR (FRAGMENT)






ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|5B4625_i2_iI | |14il | |3331 




|1263 |1194 | 


|2.6e-12i 



Protein name 



Locus Name 



dihydrolipoamide S-succinyltransferase, : 2-oxoglutarate dehydrogenase complex chain E2: succinyl



pir :S07779 



Acc# 

S07779:S63 
511 



Description 






ORF Name 



NT ID AAID 



9928130 ci 34 



11412 



TTZT 



NT AA 
Length Length 

in — 



Score 



TTS" 



Protein name 



Locus Name 



microfilarial sheath protein SHP3 precursor



|gp:AF030944 



Description 



Probability 
8.6e-07 



Acc# 

AF030944 :U 
43510 



Brugia malayi microfilarial sheath protein SHP3a (Bmshp3a) and microfilarial sheath protein SHP3 precursor (Bmshp3) genes, complete cds.



ORF Name 



NT ID AAID 



AA 

t — ^ Score 
Length Length 



12619081 c3 114 



T3TT" 



TTTT 



NT 
n 
TT7 



] i 



Probability 
|1.7e-i6 



Protein name 



Description 



Locus Name 



sp:YBAN_ECOLI 



Acc# 

P45808:P77 
478 



HYPOTHETICAL 14.8 KD PROTEIN IN PRIC-APT INTERGENIC REGION


i 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


12897562_cl_73 


1414 


|3334 | 


78 | 


i" 7 i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


135977S_c2_91 


1 1 1415 


| 3335 | 


" i 


p64 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Probability 


14064028_C2_105 


1416 


| 3336 


269 


810 | |182 | 


|4.Se-14 


Protein name 








Locus Name 


ACC# 



Description 
sp:YEAB_ECOLI



P43337



HYPOTHETICAL 21.4 KD PROTEIN IN PABB-SDAA INTERGENIC REGION






ORF Name 



NT ID AAID 



14657782 c2 104 



T3T7" 



NT 
Length 
T71 



AA 



Length Score Probability 



TIT 



TTT 



4.5e-30 



Protein name 



Locus Name 



sp:BID2_HAEIN 



Acc# 
P45248 



Description 
2) (DTB SYNTHETASE 2) (DTBS 2)



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


14719437_tl_22 


1418 


ip 338 


m i 


i 152 i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


I4882713_c3_iI6 


| 1419 


p39 


287 | 


|964 | 289 | 


|8.ie-30 | 



Protein name 



Description 



Locus Name 



sp:BIOC_HAEIN 



ACC# 
P45249 



PUTATIVE BIOTIN SYNTHESIS PROTEIN BIOC








ORF Name 


NTID 


NT 
Length 


AA 
Length 


Score 


Probability 


|ifi464750_c2_86 


J1420 


| |3340 |325 


r 1 i 


P 3 1 


|&.0e-05 | 



Protein name 



Description 



Locus Name 



sp:ZIPA_ECOLI 



Acc# 
P77173 



CELL DIVISION PROTEIN ZIPA










ORF Name 


NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


I6532256_tI_I 


| 1421 | |3341 


I 80 


I 243 1 


,95 | 


|0.00iS 



Protein name 



Locus Name 



ubiquitin protein ligase 



pir :T39585 



Acc# 
T39585 



Description 






ORF Name 


NT ID 


AAID 


NT 
Length 


, — , , Score 
Length 


Probability 


19572130 13 58 


1 11422 
1 1 


| 3342 


310 


933 11010 1 


8.2e-102


Protein name 








Locus Name 




ACC# 










sp:CYSM_ECOLI




P16703 


Description 














(O-ACETYLSERINE (THIOL)-LYASE B) (CSASE B)








ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — , i Score 
Length - ■ ■■■ 


Probability 


|19734630_t2_40 


JI423 


i p 343 


r j i 


|1S02 | |400 | 


|3 .le 


1 


Protein name 








Locus Name 




Acc# 



sp:Yaea,_HAEiM 



P44643 



Description 
HYPOTHETICAL RNA METHYLTRANSFERASE H1U33J,



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


20507762_ti_13 


[1424 


|3344 


1 I 290 1 


|873 | 548 


7.5e-53 | 


Protein name 








Locus Name 


ACC# 










|sp:DPSD_ECOLI 


P10740 


Description 












PHOSPHATIDYLSERINE DECARBOXYLASE PROENZYME,






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


20839062_c2_92 


| |14 25 


|3345 


|444 


|1335 | 482 | 


i.3e-48 


Protein name 








Locus Name 


Acc# 



Description 



sp:DEAD_HAEIN



P44586 



ATP-DEPENDENT RNA HELICASE DEAD HOMOLOG






ORF Name 


NT 

NTID AAID , — . , 
Length 


AA 

_ — . , Score 
Length 


Probability 


22144026_il_26 


| 1425 |3346 |284 | 


|855 | 443 | 


|4.3e-41 


Protein name 




Locus Name 


Acc# 



sp:RELA_HAEIN



P44644 



Description 
(PPGPP SYNTHETASE I) 






ORF Name 



NT ID 



AAID 



22147806 F3 47 



TJ5T* 



NT AA 
Length Length 

wtl 1 nnzr 



Score 
1855 | 



Probability 
|1.5e-86 



Protein name 



Description 



Locus Name 



sp:YSIC_ECOLI 



Acc# 
P24196 



(0385) 



ORF Name 


NT ID AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


|2269I300_c2__99 


1428 | 3348 


I 6 " 1 


|1839 | |588 


| |7.1e-72 | 


Protein name 






Locus Name 


ACC# 


sensor Kinase rtpA 






|gp:AB002529 


AB002529 


Description 


Pseudomonas tolaasii gene for sensor kinase rtpA, complete cds.


ORF Name 


NT ID AAID 


NT 

Length 


AA 

— , Score 
Length 


Probability 


22890917_t2_33 


|1429 | 3349 


258 


|777 | 195 


| I.5e-15 


Protein name 






Locus Name 


Acc# 



Description 



|sp:YBEN_ECOLI 



P52085 



HYPOTHETICAL 24.5 KD PROTEIN IN PHPB-HOLA INTERGENIC REGION (ORPUU)


ORF Name 


NT ID AAID 


NT AA 
— , — , Score 
Length Length 


Probability 


|23478458_c2_10^ 


1430 | 3350 


441 | [1325 | 1326 | 


|1.7e-135 



Protein name 



Locus Name 



BlOA 



gp:AP191555 



* Acc# 
AF191556 



Description 



Xenorhabdus nematophilus YfchE (yfcnE) gene, partial cds; van (varl) and BioA (bioA) genes, complete cds; and unknown gene.


ORF Name 


NT AA 
NT ID AAID — , _ — , . Score 
Length Length 


Probability 


24097812_Cl_83 


1431 |3351 |208 | |527 | 452 | 


|9.7e-44 



Protein name 



Locus Name 



sp:SSB_HAEIN



Acc# 
P44409 



Description 

SINGLE-STRAND BINDING PROTEIN (SSB) (HELIX-DESTABILIZING PROTEIN)






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability- 


24225088_r2_39 


|1432 


|3352 


P 


|186 








Protein name 


* 






Locus Name 




Acc# 


Description 
















NO-HIT 1 


ORF Name


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|25587827_cl_79 


| 1433 


| piu 


| |400 


I po 3 


I 888 1 


|7.0e 


-89 | 


Protein name 








Locus Name 




Acc# 










sp:BIOF_HAEIN 




P44422 


Description 
















LIGASE)


















MTTD 

IN 1 1JJ 




NT 
Length 


AA 
Length 


Score 


Probability 


|26i92I60_t3_64 


| |1434 


| 3354 




| 1599 


894 | 


Im- 


-89 


Protein name 








Locus Name 




Acc# 










sp:RELA_ECOLI




P11585 


Description 
















(PPGPP SYNTHETASE I)














i 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


probability 


29298385_ti_8 


1 I 1435 


| p 355 






peo 1 


|4.7e- 


1 


Protein name 








Locus Name 




ACC# 










|gp:U90439 






Description 














U90439:AE0 
02093 


Arabidopsis thaliana chromosome II section 227 of 255 of the complete sequence.


















ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|33708181_ci_75 


| 1435 


| 3355 


| 4ii 


|1236 


391 1 


3.2e-36


Protein name 








Locus Name 




Acc# 


putative histidine kinase






gp:PST249741




AJ249741 


Description 

















Pseudomonas stutzeri JM300 gacS (partial) and ggtB (partial) genes. 






ORF Name 



NTID AAID 



NT AA 
— — Score 
Length Length 



337282b8 C3 117 



TTT 



IB5" 



Probability 
0.0019 



Protein name 



Description 



Locus Name 



sp:BID2_HAEIN



Acc# 
P45248 



2) (DTB SYNTHETASE 2) (DTBS 2)








ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|35173953_ri_7 


| |1438 | |3358 


152 


{459 | 237 | 


6.8e-20 


Protein name 






Locus Name 


Acc# 



Description 



sp:YBEB_ECOLI



P05848:P77 
107 



HYPOTHETICAL 11.5 KD PROTEIN IN MRE)A 


-PHPB INTERGENIC REGION 






ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


35183451_t2_38 | |1439 |3359 


251 


|786 | |259 | 


3.1e 


-22 


Protein name 






Locus Name 




Acc# 


hypothetical protein jhp0628




pir :B71907 


B71907 


Description 












ORF Name NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|4147637_c3_120 1440 3360 


990 


|2973 | |328i| 


0.0




Protein name 






Locus Name 




Acc# 








sp:UVRA_ECOLI 


P07671:P76 
788 


Description 










EXCINUCLEASE ABC SUBUNIT A


ORF Name NTID AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


|4199006_t3_5<!> |1441 | 3361 


69 | 


pio | p | 


|0.022 


Protein name 






Locus Name 




ACC# 


NADH dehydrogenase subunit 4 




gp:AF026170


AF026170 



Description 



Teius teyou NADH dehydrogenase subunit 4 (ND4) gene, partial cds; and tRNA-His, tRNA-Ser, and tRNA-Leu genes, complete sequence, mitochondrial genes for mitochondrial products.






ORF Name 



NT ID AAID 



14328431 t3 55 



NT AA 
Length Length 
1T75 — 



Score 



Probability 
3.2e-52 



Protein name 



Description 



Locus Name 



sp:FPG_NEIME 



Acc# 
P55044 



GLYCOSYLASE)


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


4867812_c3_118 


|1443 




i 154 i 




|1.8e-28 | 


Protein name 








Locus Name 


Acc# 



Description 



sp:YIHZ_ECOLI 



P32147 



HYPOTHETICAL 15.9 KD PROTEIN IN RBN-FDHE INTERGENIC REGION (0145)


ORF Name 


NT AA 
NT ID AAID . — . , . — . , Score 
Length Length 


Probability 


|892177^cl_70 


1444 3364 | |169 |5i0 | 331 


7.4e-30 | 


Protein name 


Locus Name 


Acc# 



Description 



:D8338S 



D83386 



Shewanella violacea rhlE, cydD, cydC and putA genes, partial and complete cds.


ORF Name 


NT ID 


NT AA 

AAID 

Length Length 


Score 


Probability 


16847335J:3_5 


|1445 


|33<55 177 | |534 | 


533 


|7.3e-62 | 



Protein name 



Locus Name 



DNA-directed RNA polymerase alpha chain 



gp:AF047025



ACC# 
AF047025 



Description 



Pseudomonas aeruginosa ribosomal protein S4 (rpsD) gene, partial cds; DNA-directed RNA polymerase alpha chain (rpoA), ribosomal large subunit protein L17 (rplQ), and catalase isozyme A (katA) genes, complete cds; and bacterioferritin (bfr) gene, partial cds.






ORF Name 



115975442 c3 13 



Protein name 



Description 



NTID 



AAID 



1446 



3355 



NT 
n 
TT5 



AA 

— Score 
Length Length 



Probability 
|8.0e-33 



Locus Name 



sp:YFCM_ECOLI



ACC# 

P76938:P76 
497 



HYPOTHETICAL 21.1 KD PROTEIN IN FABB-MEPA INTERGENIC REGION




ORF Name 


NTID AAID 


NT AA 
Length Length 


Score 


Probability 


|24226S55_i2_3 


1447 | 3357 | 


155 | |498 | 


499 | 


1.2e-47 | 



Protein name 



Locus Name 



DNA-directed RNA polymerase alpha chain 



|gp:AF047025 



Acc# 
AF047025 



Description 



Pseudomonas aeruginosa ribosomal protein S4 (rpsD) gene, partial cds; DNA-directed RNA polymerase alpha chain (rpoA), ribosomal large subunit protein L17 (rplQ), and catalase isozyme A (katA) genes, complete cds; and bacterioferritin (bfr) gene, partial cds.



ORF Name 



124317501 ±2 4 



NTID 



AAID 



NT AA 
Length Length 
S3 - 



— . , Score 



] |252 | 



Probability 
|2.7e-32 



Protein name 



Description 



Locus Name 



sp:RL17_PSEAE



Acc# 
O52761



50S RIBOSOMAL PROTEIN L17


ORF Name NTID 


AAID 


NT 
Length 


— Score 
Length 


Probability 


3001593_t2_2 | 1449 


| 3359 


i p 17 i 


I 654 1 6aa 1 


|3.7e 


1 


Protein name 








Locus Name 




Acc# 


ribosomal protein S4




pir :A64095 


A64095 


Description 














ORF Name NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


4857143_cl_9 | |1450 


| 3370 


i 191 i 


|575 | 314 


4.7e 


-2« | 


Protein name 








Locus Name 




ACC# 


probable translation factor yciO 




pir :F64874 


F64874 



Description 






ORF Name 


NT ID 


AAID 


NT 
Length 


. — . , Score 
Length 


Probability 


60333V7_c3_14 


1451 


3371 


94 


1285 | 184 1 
1 III 




Protein name 








Locus Name 


Acc# 


hypothetical protein C34t , b 


. 9 




pir:T19736 


T19736 


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|10437bl7_cl_70 | 


|1452 


| |3372 


P i 


i 189 i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|11808575_c2_83 | 


|1453 


| p373 


72 | 


piy | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — , , Score 
Length 


Probability 


1359677__r2_i7 1 


|1454 


i i jiv4 


i p vj 


1122 969 


|i.8e-97 | 


Protein name 








Locus Name 


Acc# 


uroporphyrinogen decarboxylase 




gp:ECOUW89


1 U00006 


Description 


E. coli chromosomal region from 89.2 to 92.8 minutes.




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


14898317_c3_94 


|14bb 


| |3375 


i i 


1776 | |1769| 


p.ie-182 | 


Protein name 








Locus Name 


Acc# 



Description 



sp:SYD_ECOLI



P21889 



(ASPRS) 






ORF Name 



16522206 ci 56 



NT ID 
] [1456 



AAID 



AA 

^ T — ^ Score 
Length Length 



3376 



NT 

sn 



] E!ZI 



Protein name 

Description 

NO-HIT



Locus Name 



Probability 



Acc# 



ORF Name 
|16614042_c3_107 
Protein name 



NTID AAID 



— Score 
Length Length 



TT5T" 



TTTT 



NT 

|n 

] Ef 



] ED 



Probability 
rar"| |7.9e-10 ~ 



Locus Name 



hypothetical protein slr1903



pir:S77514



Acc# 
S77514 



Description 
ORF Name 



NTID AAID 



NT 



AA 



Length Length 



Score Probability 



175817 t3 32 



Protein name 



] [ 1458 | [3378 | |442 | [1329 | [1020 | |7.2e-103 

Locus Name Acc# 



glyceraldehyde-3-phosphate dehydrogenase gp:AF058302



AF058302 



Description 



Streptomyces roseofulvus frenolicin biosynthetic gene cluster, complete
sequence.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|20984532_cl_68 


1459 


| 3379 


1 60 1 


I 183 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


|2i42i51_t3_38 


|1460 


| 3380 


| 252 


|759 | 421 


2.1e-39



Protein name 



Locus Name 



anion transport ABC transporter (ATP-binding)
homolog ytlC



pir:C69995



Acc# 
C69995 



Description 






ORF Name 



123437555 12 24 



Protein name 



NT ID 



AAID 



NT AA 
Length Length 
TPS 1 1104 V 



Score 



Probability 
5.0e-95 



Locus Name 



3-phosphoserine aminotransferase



gp:AF038578



Description 



Acc# 

AF038578:M73971:M35545



Pseudomonas stutzeri gyrase A subunit (gyrA) gene, partial
cds; 3-phosphoserine aminotransferase (serC), chorismate mutase/prephenate
dehydratase (aroQp/pheA), imidazole acetol phosphate aminotransferase (hisHb),
and cyclohexadienyl dehydrogenase (tyrAc) genes, complete cds;
and 5-enolpyruvylshikimate 3-P synthase (aroF) gene, partial cds.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|23526888_tl_7 


|1462 


|3382 


65 


p8 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|23642875_r3_39 


|1463 


| |3383 


255 


I 768 1 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


„ — . , Score 
Length 


Probability 


|24i03137_t2_15 


1454 


3384 


| 403 


|1230 |1079| 


I4.0e-109 


Protein name 








Locus Name 


Acc# 



sp:YHBZ_HAEIN



P44915 



Description 

HYPOTHETICAL 43.4 KD GTP-BINDING PROTEIN HI0877






ORF Name 



124272135 c3 103 



Protein name 



NT ID 



AAID 



NT AA 
Length Length 
T7T 



Score 



] i 



Probability 
|4.8e-26 



Locus Name 



Lrp-family transcriptional regulators



gp:D89015



Acc# 
D89015 



Description 



Pseudomonas putida genes for MdeR, MdeA and MdeB, complete cds.



ORF Name 



|24410038_t2_19 | [1455 | [3385 | 



NT ID 



AAID 



NT 
n 



AA 

T — ^ Score 
Length Length 



nrrrr 



Probability 
743 I |1.5e-73 



Protein name 



Locus Name 



proteinase DO 



pir:H71936 



Acc# 
H71936 



Description 



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


25562512_c3_98 


|1467 


| J3387 


304 


|915 | 532 


3.7e 


1 


Protein name 










Locus Name 




Acc# 












sp:YJJP_ECOLI 


P39402 


Description 
















HYPOTHETICAL 30.5 KD PROTEIN IN DNAT-BGLJ INTERGENIC REGION (P277)


1 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|25665963_tl_5 


| |1468 


|3388 


264 


|795 | |444 | 


7.8e- 


-42 


Protein name 










Locus Name 




Acc# 












sp:GLO2_ECOLI


Q47677 


Description 
















II) (GLX II) 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|2757750_£3_47 


| |1469 


1 P 389 1 


[72 


P" 1 P 3 1 


10.016 


Protein name 










Locus Name 




ACC# 












gp:AB021078 




AB021078 



Description 



plasmid ColIb-P9 DNA, complete sequence . 






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability


I31412955_t2_23 


1470 


3390 


250 


I 753 1 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


33394002_t2_30 


i i i4vi 


| |3391 


1 F»« 1 


1524 |79 | 


jU.OJi | 


Protein name 










Locus Name 




Acc# 


cytochrome-c oxidase, chain I RP405


pir:D71598 


D71698 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


35181680_C3_95 


1472 


| 3392 


355 1 


|1071 | |257 | 


|4.5e 


1 


Protein name 










Locus Name 




Acc# 










gp:PFY14568




Y14568 


Description 
















Pseudomonas fluorescens tag gene and partial glyQ, ntrB genes.


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


4009750_ci_59 


| 1473 


|3393 


221 


555 | |279 | 


|2.4e- 


1 


Protein name 










Locus Name 




Acc# 


hypothetical protein






pir:S75551 




S76551 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


4165952_t3_34 


| |1474 


| |3394 


IF 1 


P 4y 1 






Protein name 










Locus Name 




ACC# 



Description 



NO-HIT



ORF Name NTID AAID 


NT 
Length 


AA 

m — . , Score 
Length 


Probability 


|4328443_cl_74 | 1475 | 3395 


176 


531 193 


3.1e-15


Protein name 






Locus Name 


ACC# 


hypothetical protein


pir:G75479


G75479 


Description 










ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


14423193 c2 79 1 11476 1 13396 
1 - - II II 


1 I 85 1 
1 1 1 


|258 | |87 | 


|4.6e-07 


Protein name 






Locus Name 


Acc# 








sp:ARGD_ARCFU 


O30156


Description 










ACETYLORNITHINE AMINOTRANSFERASE, 


(ACOAT) 








ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|4864077_c2_78 | |1477 | |3397 


1 w 1 


|192 | |149 


1.4e-10 


Protein name 






Locus Name 


Acc# 


unknown




gp:AF062531 


AF062531 


Description 


Pseudomonas putida GB-1 signal peptidase (pilD) gene, partial cds; and
unknown genes.


ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|4878407_ti_6 |1478 | |3398 


589 


|1770 | |1355| 


2.3e-138


Protein name 






Locus Name 


Acc# 








sp:LEU1_YEAST


P06208 


Description 










SYNTHASE) (ALPHA-IPM SYNTHETASE)


ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length - 


Probability 


5085963_ti_il | 1479 | 3399 


| 243 


732 124 


3.2e-13 | 


Protein name 






Locus Name 


Acc# 



sp:YDFN_BACSU



P96692 



Description 

PUTATIVE NAD(P)H NITROREDUCTASE YDFN,






ORF Name NTID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


15115943 f2 11 " | 11480 1 13400 


|330 


993 585 


9.0e-57 | 


Protein name 




Locus Name 


Acc# 


hypothetical protein TM0484




pir:C72369 


C72369 


Description 








ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|5265588_t2_29 | |1481 3401 


| 878 


|2637 | |2754 


| |1.3e-286 


Protein name 




Locus Name 


ACC# 


UspA1




gp:AF113606


AF113606 


Description 


Moraxella catarrhalis strain ATCC25238 UspA1 (uspA1) gene, complete cds.


ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


802137_£3_37 |1482 3402 


i 241 i 


|786 459 


2.0e-43 


Protein name 




Locus Name 


Acc# 


ABC transporter, permease protein, cysTW family


pir:D72369 


D72369 








Description 








ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|894387_c2_80 |1483 | |3403 


|160 


|483 | |313 


| |6.0e-28 | 


Protein name 




Locus Name 


Acc# 



sp:YJJP_HAEIN 



P44520 



Description 
HYPOTHETICAL PROTEIN HI0108 



ORF Name 



1976558 Tl 18 



NTID 
j [1484 



AAID 



3404 



NT 
Length 
ST 



— . , Score 



AA 
Length 



Protein name 
Description 
NO-HIT



Locus Name 



Probability 



Acc# 






ORF Name 



NT ID AAID 



110740 



682 c'2 12 



NT 



AA 

— Score 
Length Length 

F73~ 



3W 



Probability 
|1.3e-66 



Protein name 



Locus Name 



probable acyl-CoA dehydrogenase 



|pir:B75282 



Acc# 
B75282 



Description 

ORF Name 
| 16829202_E Q- 



NTID 
] [1485 



AAID 



][ 



3406 



NT AA 
Length Length 

] EEEI ~ 



Score 



Protein name 

Description 
4-AMINO-4-DEOXYCHORISMATE LYASE, (ADC LYASE)



|753 | |185 | 

Locus Name 



Probability 
2.2e-14



sp:PABC_ECOLI



Acc# 
P28305 



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|25421887_ti_2 


|1487 


3407 


188 


p" i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|34I609i8_t3_7 


|1488 


| 3408 


1 " 






Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|636563i_c3_13 


|1489 


| 3409 


275 


|828 | |475 | 


4.1e-45 



Protein name 



Locus Name 



shikimate dehydrogenase 



gp:NPU82846



Acc# 
U82846 



Description 



Neisseria pharyngis var. flava shikimate dehydrogenase (aroE) gene, complete
cds.






ORF Name 



112156514 cl 16 



Protein name 



Description 



NT ID AAID 



NT 
Length 
T^l 



AA 

— ^ Score Probability 
Length L ~ 



2.2e-30 



Locus Name 



sp:RSTA_ECOLI



Acc# 
P52108 



TRANSCRIPTIONAL REGULATORY 


PROTEIN RSTA 




i 


ORF Name NT ID 


NT 

AAID — _ 
Length 


AA 

. — . , Score 
Length 


Probability 


15625078_cl_19 | |1491 


3411 | |179 | 


|537 | 444 | 


|7.8e-42 | 


Protein name 




Locus Name 


Acc# 



Description 



sp:TRMD_SERMA



P36244 



METHYLTRANSFERASE)


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Probability 


23468928_cl_18 


|1492 


3412 


| 191 


|576 | 295 


4.8e-25 


Protein name 








Locus Name 


Acc# 



sp:RIMM_HAEIN 



P44568 



Description 
16S RRNA PROCESSING PROTEIN RIMM



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


23859377_ti_4 


| 1493 


| 3413 | 


502 | 


|1509 | |525 


|2.0e-50 | 


Protein name 








Locus Name 


ACC# 


EnvZ protein 








gp:YEOMPR


Y08950 


Description 


Y. enterocolitica ompR and envZ genes








ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


3961587_c2_21 


| 1494 


3414 


85 


|26i |279 | 


|2.4e-24 


Protein name 








Locus Name 


ACC# 



sp:RS15_HAEIN



P44382 



Description 
30S RIBOSOMAL PROTEIN S15






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


S54592 c3 22 "| 


1495 


3415 


598 


1797 442 


1.3e-41 


Protein name 










Locus Name 


Acc# 












sp:RSTB_ECOLI 


P18392


Description 














SENSOR PROTEIN RSTB, 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — , , Score 
Length 


Probability 


10S7S257_t2_68 | 


|I496 


| pus 


i p" i 


|1182 |1212 | 


|3.2e-123 


Protein name 










Locus Name 


Acc# 












sp:PUR9_HAEIN 


P43852 


Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|10736257_t3_80 | 


|1497 


1 1 3417 


ii" i 


|198 




Protein name 










Locus Name 


Acc# 


Description 














NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|12556337_rl_31 


1498 


3418 


| pu 


|387 | |477 | 


|2.5e-45 | 


Protein name 










Locus Name 


Acc# 












sp:PUR9_ECOLI


P15639 


Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


|1272283_t2_60 | 


|1499 


| |3419 


i r i 


p« I P 44 I 


|7.9e-20 | 


Protein name 










Locus Name 


Acc# 



sp:AARF_ECOLI 



Description 
UBIQUINONE BIOSYNTHESIS PROTEIN AARF 



P27854:P27855:P76764:P27853






ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Probability 


1131700 ci 108 


I pnnr 


3420 


255 


758 215 


r.4e 


-16 


Protein name 










Locus Name 




Acc# 


putative peptidyl-prolyl cis-trans isomerase




gp:ASAJ2316


AJ002316 


Description 














Acinetobacter sp. alkR & alkM genes, ORF1 & ORF4.




i 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — ■ i Score 
Length 


Probability 


13876562_ci_128 


[1501 


| |342i 


i i 


|228 | |73 | 


10.016 



Protein name 



Locus Name 



immunoglobulin Kappa light chain variable 



gp:AF131155 



Acc# 
AF131156 



Description 



Mus musculus immunoglobulin Kappa light chain variable region gene, partial 
cds . 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


13947127_c3J>17 


| [1502 


| |3422 


i i 


|1755 | [1215 | 


|1.6e 


-163 


Protein name 










Locus Name 




Acc# 












sp:SYQ_HAEIN 


P43831 


Description 
















(GLNRS)


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


14852035_cl_129 


| |1503 


| (3423 


1 F 1 


258 |70 | 


|0.033 


Protein name 


Locus Name 


Acc# 



tat protein 



|gp:imW85775 



Description 



HIV-1 clone ZAM184-5.2 from Zambia, tat protein (tat) gene, partial cds, rev
protein (rev), vpu protein (vpu), and envelope glycoprotein (env) genes,
complete cds and nef protein (nef) pseudogene.



ORF Name 



15553417 n 42 



NTID AAID 
[1504 



NT AA 
Length Length 

m 1 im" 



Score 



Protein name 
Description 

rar 



Locus Name 



Probability 



Acc# 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|16583425_cl_131 


| |150«> 


3425 


326 


|981 | 535 


1.8e 


-51 


Protein name 










Locus Name 




Acc# 


ytjB protein 


pir:B65040 




B65040 


Description 
















ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

t — , i Score 
Length 


Probability 


|i9632665_c2_I60 


| |1506 


| |34 2S 


1 1 


|2091 | |633 | 


|2 ,3e 


-79 | 


Protein name 










Locus Name 




Acc# 












sp:COPA_ENTHR


P32113:Q47841


Description 














COPPER/POTASSIUM-TRANSPORTING ATPASE A,










ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


19706557_c3_193 


1507 


| 3427 


| 216 


|65i | |I4S 


|4.8e- 


-08 | 


Protein name 










Locus Name 




Acc# 


probable component of cation transport for


pir:E71813




E71813 


cbb3-type oxidase 






























Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


21753552_c3_220 


1508 


3428 


168 


|507 | 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


„ — . , Score 
Length 


Probability 


2197962_11_11 


| |i509 


| |3429 


122 


P 6S 1 






Protein name 










Locus Name 




ACC# 



Description 
NO-HIT






ORF Name 



NTID 



AAID 



22145253 c2 111 



T5I7T 



3430 



NT 
Length 
2TTJ 



AA 

— _ Score Probability 
Length L 



792" 



1.6e-57 



Protein name 



Description 



Locus Name 



sp:ORN_HAEIN



Acc# 
P45340 



OLIGORIBONUCLEASE,


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


22272900_tl_5 


1511 


1 P 431 


1 227 1 


i s84 i 


253 


1.2e-22 



Protein name 



Locus Name 



hypothetical protein



Description 



PST243354 



Acc# 
AJ243354 



Pseudomonas stutzeri hyp1 and comA genes and putative tolQ, exbB, tolR and
exbD genes.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — t , Score 
Length 


Probability 


22285902_c3_212 


1 I 1512 


1 P 432 


| 229 | 


590 | |299 | 


1.8e 


-26 


Protein name 








Locus Name 




Acc# 


transposase slr2062 :protein slr2062 :protein


pir:S74909




S74909 


slr2062 


























Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — ■ i Score 
Length 


Probability 


22710402_C2_154 


| 1513 


3433 


78 | 


237 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


234b7632Jt3_90 


1 1514 


| |3434 


i asb i 


|888 | 88S | 


1.5e- 


■88 | 



Protein name 

Description 
(EC 2.1.1.-) 



Locus Name 



|sp:UBIE_ECOLI 



ACC# 
P27851 






ORF Name 



123475002 tl 9 



Protein name 



Description 



NT ID AAID 



NT 
n 



AA 

— Score 
Length Length 



5T9~ 



T2T" 



Probability 
4.8e-06 



Locus Name 



sp:CUTF_ECOLI



Acc# 
P40710 



COPPER HOMEOSTASIS 


PROTEIN 


CUTF PRECURSOR (LIPOPROTEIN NLPE)




ORF Name 


NTID 


NT AA 
AAID _ — ^, . — . , Score 
Length Length 


Probability 


|23554676_ti_16 


| pi* 


|343<5 384 |1155 | |692 | 


4.1e-68 | 



Protein name 



Locus Name 



sp:AARF_ECOLI 



Description 

UBIQUINONE BIOSYNTHESIS PROTEIN AARF



Acc# 

P27854:P27855:P76764:P27853



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|2363465<!>_c3_200 


| 1517 


| 3437 


|97 


|294 | |147 | 


|2.3e-10 


Protein name 








Locus Name 


Acc# 



Description 



sp:YEAC_ECOLI



P76231 



HYPOTHETICAL 10.3 KD PROTEIN IN ANSA-GAPA INTERGENIC REGION 


ORF Name 


NT AA 

NTID AAID , . , . 

Length Length 


Score Probability 


|24015950_cl_147 


(1518 | |3438 | |201 | \€06 


|207 | |1.0e-16 | 



Protein name 



Locus Name 



Hypothetical protein 



gp:AF157493



Acc# 
AF157493 



Description 



Zymomonas mobilis ZM4 fosmid clone 42D7, complete sequence.




i 


ORF Name 


NT 

NTID AAID „ L , 

Length 


AA 

. — . , Score 
Length 


Probability 


|24259702_tl_10 


|1519 | 3439 | 500 | 


|1503 |275 | 


|2.4e 


i 


Protein name 




Locus Name 




ACC# 






sp:YF46_ARCFU 




O28726



Description 
HYPOTHETICAL PROTEIN AF1546






ORF Name 
|24303583_tlJ30 



NTID 
"j |152U 



AAID 



NT AA 

— — Score 

Length Length 

ST 



Probability 
l.le-12 



Protein name 



Locus Name 



small DNA binding protein Fis 



|gp:AF040379 



Acc# 
AF040379 



Description 



Proteus vulgaris ribosomal protein L11 methyltransferase (prmA) gene, partial
cds; yhdG homolog gene, complete cds; and small DNA binding protein Fis (fis)
gene, partial cds.



ORF Name 


NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


24306510_c3_209 


|152i | |3441 


1 ^ 4 1 


ps | 




1.0e-41 | 


Protein name 






Locus Name


Acc# 








sp:EST2_PSEFL 


Q53547 


Description 












CARBOXYLESTERASE 2, 


(ESTERASE II)










ORF Name 


NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


246137b2_c2_168 


|1522 |3442 


241 | 


I 725 1 


813 


6.2e-81



Protein name 



Locus Name 



superoxide dismutase , (Mn) : SodA protein 



pir:JC5542 



Acc# 
JC6542 



Description 



ORF Name 



NTID 



AAID 



124614125 12 6b 



[1523 | 



NT 



AA 

— Score 
Length Length 



[1566 | 



Probability 
1.0e-160 



Protein name 



Locus Name 



penicillin-binding protein 1B



gp:AF147449



ACC# 
AF147449 



Description 



Pseudomonas aeruginosa strain PAO1 penicillin-binding protein 1B (ponB) gene,
complete cds



ORF Name 



NTID 



AAID 



NT 



AA 



Length Length 



Score 



124640762 13 19 



Protein name 

Description 
ADENOSYLTRANSFERASE) (ADOMET SYNTHETASE)



[1167 | [1412 | 

Locus Name 



Probability 
2.1e-144



sp:METK_ECOLI



ACC# 

P04384:P30869






ORF Name 



2BBB13ttb tl 17 



NTID 



AAID 



NT AA 

— , — , Score 

Length Length 

T7T 



55T 



Probability 
2.3e-26



Protein name 



Locus Name 



adenine phosphoribosyltransferase, :protein
sll1430 :protein sll1430



pir:S75440



Acc# 
S75440 



Description 
ORF Name 



NTID AAID 



NT 



AA 



Length Length 



Score Probability 



12566b962 tl B 



Protein name 



152S | [ 3446 | [115 | |34B | [75 1 [0.0099 ~ 

Locus Name Acc# 



glutamyl-tRNA (Gln) amidotransferase subunit



pir:D70484



D70484 



Description 



ORF Name 



NTID 



AAID 



NT AA 
— , — , Score 
Length Length 



310 



Probability 
] P 3 | |i.5e-4i — 



Protein name 



Locus Name 



hypothetical protein 



pir:S76006



Acc# 
S76006 



Description 

ORF Name 
|306b02b0 tl 12 



NTID 
"j [1528 



AAID 



AA 

— , Score 
Length Length 



NT 



] EZZZ1 [ 



Probability 
3.9e-65



Protein name 



Locus Name 



conserved hypothetical protein 



pir:A75256



ACC# 

A75256 



Description 



ORF Name 



NTID 



AAID 



AA 

— , Score 
Length Length 



31541442 c3 IBi 



NT 
n 

TT7 



Probability 



1954 



TT5~ 



6.5e-30 



Protein name 



Locus Name 



putative peptidyl-prolyl cis-trans isomerase 



gp:ASAJ2316



Acc# 
AJ002316 



Description 

Acinetobacter sp. ADP1 alkR & alkM genes, ORF1 & ORF4.






ORF Name 


NT ID AAID 


NT AA 
Length Length Score


Probability


[32694687 c2 181 


| [1530 3450 


140 1 423 1 1117 1 
1 III 


1 


Protein name 




Locus Name 


Acc# 






sp:YPBB_BACSU


P50728 


Description 








HYPOTHETICAL 40.7 KD PROTEIN IN FER-RECQ INTERGENIC REGION


i 


ORF Name 


NTID AAID


NT AA 
— — Score 
Length Length 


Probability


33213555 c3 216 


1 11531 1 3451 
1 1 1 


I |60 1153 | 1106 | 

II I ill 


|1.3e-05 


Protein name 




Locus Name 


Acc# 






gp:ECU82664


U82664 


Description 








Escherichia coli minutes 9 to 11 genomic sequence.


1 


ORF Name 


NTID AAID 


NT AA 
— — Score 
Length Length 


Probability 


33245927 c3 213 


[1532 | 3452 


| 229 [690 | 606 


5.3e-59 | 


Protein name 




Locus Name 


ACC# 






sp:YCEV_ECOLI 


P75957 


Description 








HYPOTHETICAL ABC TRANSPORTER ATP-BINDING PROTEIN YCEV


i 


ORF Name 


NTID AAID 


NT AA 
— , — , Score 
Length Length 


Probability 


33394050_t3_76 


| [1533 | [3453 


| [269 | [810 | [340 | 


8.2e-31



Protein name 



Locus Name 



BE ECOLI 



Description 

HYPOTHETICAL 26.9 KD PROTEIN IN PURE-PPIB INTERGENIC REGION



Acc# 

P43341:P77440



ORF Name 



NTID 



AAID 



3465 t3 99 



NT AA 

— — Score 

Length Length 

7T 



] E^lD 



Protein name 



Description 



Locus Name 



Probability 



Acc# 



NO-HIT






ORF Name 
|3906S61_t:TTB" 



NT ID AAID 



NT AA 
— — Score 

Length Length 



7T7~ 



TIT" 



Probability 
9.7e-28 



Protein name 



Locus Name 



gp:STMBLDA



Acc# 
M80628 



Description 

Streptomyces griseus transfer RNA-Leu (bldA) gene and ORF, complete cds.



ORF Name 
|3910593_t2^3y 



NT ID 



[TS7T* 



AAID 



NT AA 
Length Length 
TT2 



— . , Score 



] ED 



Protein name 

Description 
(ROTAMASE B) 



Locus Name 



Probability 
|1.5e-S0 

ACC# 



sp:CYPB_ECOLI 



P23869:P78052



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|3944178_t2_52 


|I537 


| 3457 


328 


|987 | 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|39532i8_ci_i25 


|i538 


|3458 


943 | 


J2832 155 


1.1e-10



Protein name 



Locus Name 



PhoC protein



|gp:KPM250377 



Acc# 
AJ250377 



Description 



Klebsiella pneumoniae partial selD gene for SelD protein and phoC gene for
PhoC protein.


NT AA 

ORF Name NTID AAID _ — L , „ — . , Score 

Length Length 


Probability 


3991527_t2j57 |1539 3459 2142 | |6429 | 


577 


4.0e-51



Protein name 
Description 

Haemophilus influenzae hsf gene, complete cds.



Locus Name 



gp:U41852



Acc# 
U41852 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


|4322793_r2_57 


1540 


3460 


217 


654 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|4410943_c3_219 


1 1 1541 


1 P 46i 


ir i 


|276 | 103 


1.1e-05


Protein name 








Locus Name 


Acc# 



sp:YGFY_ECOLI



Q46825 



Description 

HYPOTHETICAL 10.5 KD PROTEIN IN FLDB-BGLA INTERGENIC REGION



ORF Name 



14688887 C3 211 



NT ID 



AAID 



][ 



3462 



NT AA 
Length Length 
W^l ] [1359 



Score 



TT2" 



Probability 
|7.9e-07 



Protein name 



Locus Name 



metal transporter Nramp4 



gp:AF202540



Acc# 
AF202540 



Description 



Arabidopsis thaliana metal transporter Nramp4 mRNA, complete cds.


ORF Name NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|4782812_cl_14i |1543 | 


|3463 


i i i4v i 


|444 | |95 | 


|0.01i | 


Protein name 






Locus Name 


Acc# 


hypothetical protein TM1026






pir:A72303


A72303


Description 










ORF Name NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|4798430_cl_151 | |1544 | 


3464 


i 4 " i 


J1362 | |447 | 


|7.2e-62 | 



Protein name 



Locus Name 



gp:SC9745



Description 

S.cerevisiae chromosome XIII cosmid 9745.



ACC# 

Z38114:Z71257






ORF Name 



NTID 



AAID 



5125318 c3 206 



NT AA 
Length Length 
ITS 1 11071 



Score 



Probability 
5.0e-2i 



Protein name 



Description 



Locus Name 



gp:ATAC007168



Acc# 
AC007168 



Arabidopsis thaliana chromosome II BAC T26C19 genomic sequence, complete
sequence.



ORF Name 



15192757 ci 144 



NTID 
] |154£ 



AAID 



AA 

— Score 
Length Length 



NT 

] EEE 



TIT" 



Probability 
1.5e-72



Protein name 



Locus Name 



sp:YCEW_ECOLI 



Acc# 
P75958 



Description 

HYPOTHETICAL 45.3 KD PROTEIN IN MFD-COBB INTERGENIC REGION



ORF Name 



5343752 t3 101 



NTID 

]EHI 



AAID 



NT 
Length 



AA 

_ — _ Score 
Length 



Protein name 

Description 
RIBOSOMAL PROTEIN L11 METHYLTRANSFERASE,



EED ED 

Locus Name 



Probability 

|1.5e-61 

Acc# 



sp:PRMA_ECOLI



P28637:P76680:P76681



ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|7u42580_cl_142 


|1548 | |3468 | 


p> | 


|229 | 




Protein name 






Locus Name 


Acc# 


Description 










NO-HIT


ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|7312717_t3_77 


| |1549 | 3469 


75 | 


|228 |77 | 


|0.028 


Protein name 






Locus Name 


Acc# 


conserved hypothetical protein 262 




pir:S59078


S59078 



Description 






ORF Name 



NT ID AAID 



1822680 C3 218 



t^tt 



NT AA 

— — Score 

Length Length 

WT1 1 



Probability 
8.8e-82 



Protein name 



Locus Name 



glyceraldehyde-3-phosphate dehydrogenase



Acc# 
M87647 



Description 



Bacillus megaterium glyceraldehyde-3-phosphate dehydrogenase
(gap), phosphoglycerate kinase (pgk), and triose phosphate isomerase
(tpi) genes, complete cds.



ORF Name 



NT ID 



AAID 



9767325 t3 82 



NT AA 
Length Length 
147 | |444 



Score Probability 
|4 76 | |3.2e-45 — 



Protein name 



Locus Name 



transposase homolog A 



gp:HPU95957



Acc# 
U95957 



Description 



Helicobacter pylori insertion sequence IS606 transposase homologs A (tnpA)
and B (tnpB) genes, complete cds.



ORF Name 



NT ID 



AAID 



NT 



AA 



Length Length 



Score Probability 



112535413 E3 5 



Protein name 



Description 



] \ ±BB2 1 P 472 | | 813 | p442 | [1030 | |4.2e-129 ~ 

Locus Name Acc# 



sp:UP05_ECOLI



P39170:P39181:P77465



UNKNOWN PROTEIN FROM 2D-PAGE SPOTS M62/M63/03/09/T35 PRECURSOR



ORF Name 



NT ID 



AAID 



NT 



AA 



— , — , Score Probability 
Length Length *- 



131671880 tl 2 



][ 



I3T7T 



Protein name 



185 | |558 | |350 | |5.2e-33 ~~ 
Locus Name Acc# 



FabZ 



gp:NMU79481



U79481 



Description 



Neisseria meningitidis
UDP-3-O-(R-3-hydroxymyristoyl)-glucosamine N-acyltransferase (lpxD) gene,
partial cds, and 3(R)-hydroxymyristoyl acyl carrier protein dehydrase (fabZ)
and UDP-N-acetylglucosamine acyl transferase (lpxA) genes, complete cds.






ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 




36148427_£3_8 


1554 3474 


67 


poi | 






Protein name 






Locus Name 


Acc# 




Description 












NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 




4412963_t2_4 


| [1555 | |3475 | 


r i 


|558 | |470 | 


|1.4e-44 




Protein name 






Locus Name 


Acc# 










|sp:LPXA_ECOLI 


1 P10440: 


P78 


Description 








243 


(EC 2.3.1.129) 


(UDP-N-ACETYLGLUCOSAMINE ACYLTRANSFERASE)




i 


ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 




468^S40_t2_3 


| 1555 | 3475 | 


340 | 


1023 | |667 | 


|i.8e-65 


i 


Protein name 






Locus Name 


Acc# 










sp:LPXD_HAEIN 


P43888 


Description 












(EC 2.3.1.-)


ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 




|ii978127_±i_2 


| 1557 | |3477 | 


379 1 


1140 811 | 


|1.0e-80 


i 


Protein name 






Locus Name 


Acc# 










sp:YECP_ECOLI


1 P76291: 


O07 


Description 








983 


HYPOTHETICAL 37.0 KD PROTEIN IN ASPS-BISZ INTERGENIC REGION






ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — , . Score 
Length 


Probability 




14658562_i3_6 


| 1558 | |3478 


309 | 


|930 835 ; 


2.9e-83 




Protein name 






Locus Name 


Acc# 










sp:YEDI_ECOLI


1 P46125: 


P76 


Description 








332 


HYPOTHETICAL 32.2 KD PROTEIN IN DSRB-VSR INTERGENIC REGION










ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


23714375 t3 8 


1559 


3479 | 


1100 1 
1 1 


|i03 | |70 | 


|0.033 


Protein name 










Locus Name 




Acc# 


outer membrane protein H.8 precursor






pir:S04157


S04157 


Description 
















ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


24253427_t3_7 


1 1550 


1 3480 1 
1 1 


85 | 


p5S | 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT 














i 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|24322153J:3_5 


1 I 1561 


||»4»1 | 


I 257 1 


|774 | |475 | 


4.1e 


i 


Protein name 










Locus Name 




Acc# 












sp:YECO_HAEIN 




Description 














P43985:P43986


HYPOTHETICAL PROTEIN HI0319/320 




ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

— , Score 
Length - 


Probability 


24804$5i_c2_16 


] |1562 


||Mlil | 


F 1 


|189 | |17i | 


|3.1e- 


-12 


Protein name 










Locus Name 




Acc# 












sp:SSP2_PLAYO


Q01443 


Description 
















SPOROZOITE SURFACE PROTEIN 2 PRECURSOR












ORF Name 


NTID 


AAID 


NT AA 
— — Score 
Length Length 


Probability 


35i81956_ci_ii 


1563 


3483 


251 


1 1 






Protein name 










Locus Name 




Acc# 



Description 
NO-HIT 






ORF Name 



NT ID 



AAID 



NT AA 
— — Score 
Length Length 



3955437 c2 19 



T5W 



[7T 



Probability 
TTE | |2.ie-09 — 



Protein name 



Locus Name 



peptide methionine sulfoxide reductase



pir:E75345 



Description 



ORF Name 



5117337 F3 9 



NTID 
] [1555 



AAID 



Score 



NT AA 
Length Length 

] |ii40 | pres" 



T7? 



Protein name 



Locus Name 



serine-pyruvate aminotransferase



pir:F75269



Description 



ORF Name 



NTID AAID 



1053441 c3 193 



AA 

— , Score 
Length Length 



NT 
n 



] i 



Protein name 



Locus Name 



hypothetical protein 25 



pir:T13514



Description 



ORF Name 



NTID 



AAID 



10650681 c3 218 



3TET 



NT 

H 
P77 



— Score 
Length Length 



[37T9~ 



[7T 



Protein name 



Locus Name 



unknown 



|gp:AF050676 



Description 



Acc# 
E75345 



Probability 
p.0e-116 

Acc# 
F75269 



Probability 
1.4e-21 



Acc# 
T13514 



Probability 
|i.0e-05 



Acc# 
AF050676 



Pseudomonas aeruginosa lipoprotein (oprX) and ferric uptake regulator (fur)
genes, complete cds; and unknown genes.



ORF Name 



119027 c2 166 



NTID 



AAID 



13488 



NT AA 
Length Length 
[51T 



Score Probability 



] F I EUD 



Protein name 



Description 



Locus Name 



ACC# 






ORF Name 



NTID 



AAID 



11227302 C3 221 



T3W 



NT AA 
Length Length 

1 \m — 



Score 



Protein name 



Locus Name 



probable fatty-acid--CoA ligase, taODV



pir:C69471



Probability 
10.012 



Acc# 
C69471 



Description 



ORF Name 


NTID 


AAID 


NT 
Length 


— Score 
Length 


Probability 


12773910_c2_171 


| |1570 


| |3490 


1 116 1 


pbi | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|12973332_rl_18 


| |1571 


| |349I 


1 " 1 


i 152 i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length ■ 


Probability 


12992125_c2_170 


1572 


| 3492 


152 


|459 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


13085160_c3_195 


1573 


3493 


236 


pii | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|1371003_t3J}7 


| 1574 


| |3494 


1 41 * 1 


(1242 | |1088 | 


|4.5e-110 | 


Protein name 








Locus Name 


ACC# 


Na+/H+-exchanging protein: Na+/H+ antiporter


pir:JX0360


JX0360 



Description 






ORF Name 



114647033 c3 209 



Protein name 



NTID AAID 



NT 
Length 
TT2 



AA 

T — _u Score 
Length 



Probability 
1.8e-07 



Locus Name 



Acc# 



muramoyl-pentapeptide carboxypeptidase




pir :T34747 


T34747 


Description 














ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


[14745253_c3_2I4 | 


|I575 |3496 


|46« | 


I 1 


407 | |596 | 


S.le 




Protein name 








Locus Name 




Acc# 


hypothetical protein Rv3734c




pir :G70797 


G70797 


Description 














ORF Name 


NTID AAID 


NT 
Length 


AA 

t — , i Score 
Length - - 


Probability 


15633253_cl_149 | 


J1577 | |3497 


1 l lb0 1 


I 453 1 






Protein name 








Locus Name 




ACC# 


Description 














NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — , , Score 
Length 


Probability 


|164U590<!>_c3_192 | 


|I578 | [3498 


1 P 4 1 


|I455 |207 | 


2.3e- 


-14 


Protein name 








Locus Name 




ACC# 



sp:VG17_BPMD2 



064210 



Description 
MAJOR HEAD PROTEIN BFT7



ORF Name 
|U595716_t2_69 

Protein name 
Description 
NO-HIT



NTID 



AAID 



raW- 



NT 
Length 

] PZZ 



AA 

r — Score 
Length 

] E!0 

Locus Name 



Probability 



Acc# 






ORF Name 



NTID 



AAID 



NT AA 
— — Score 
Length Length 



1659 7 827 cl 135 



T5W 



Probability 
10.022 



Protein name 



Locus Name 



putative prohead protease



gp:AF181080



ACC# 
AF181080 



Description 



Rhodobacter capsulatus putative large terminase, putative portal protein, and
putative prohead protease genes, complete cds; and putative capsid protein
gene, partial cds.



ORF Name 


NT 

NTID AAID — , 
Length 


AA 
Length 


Score 


Probability 


|19547875_ci_I57 


| [1581 | |3501 | |124 | 


P 75 1 


i 185 1 


|8.2e-15 | 



Protein name 



Locus Name 



mono-heme c-type cytochrome ScyA



E 



AF044582 



Acc# 
AF044582 



Description 



Shewanella putrefaciens NrfG homolog gene, partial cds; and mono-heme c-type
cytochrome ScyA (scyA), cytochrome c maturation protein A (ccmA), cytochrome c
maturation protein B (ccmB), cytochrome c maturation protein C (ccmC),
cytochrome c maturation protein D (ccmD), and cytochrome c maturation protein
E (ccmE) genes, complete cds.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|196972<=>5_c2_179 


| (1582 


| (3502 


1 1 65 1 


I 198 1 I 75 1 


J0.020 


Protein name 








Locus Name 


Acc# 



sp:VC67_ASTLO 



P34778 



Description 

HYPOTHETICAL 20.1 KD PROTEIN VCT67 (ORF170)



ORF Name 



120917082 t3 105 



NTID 
] [1583 



AAID 



][ 



NT AA 
Length Length 
71 1 



Score Probability 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT 






ORF Name 


NTID 


AAID 


NT 


AA 

. — . , Score 
Length 


Probability 


|2I6634I0_tl_I7 


1584 


|3504 


170 | 


1 






Protein name 










Locus Name 




ACC# 


Description 
















NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|22351557_c3_191 


| 1585 


|3505 




|270 | |70 | 


|0.0039 | 


Protein name 










Locus Name 




Acc# 


hypothetical protein F26B6.23






pir :T01147 


T01147 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|22381542_c3_199 


1585 


| |3505 


i 2y ' i 


|774 | |457 | 


|3.3e- 


1 


Protein name 










Locus Name 




Acc# 


minor tail protein L homolog: protein gp18


pir:T13104 




T13104 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|23437561_r2_52 


| 1587 


p»7 


| 690 | 


2073 |1985| 


|4.0e- 


-205 


Protein name 










Locus Name 




Acc# 



Description 



sp:SYM_HAEIN



P43828 



(METRS)


ORF Name NTID 


AAID 


NT 
Length 


. — . , Score 
Length 


Probability 


|23549217_cl_144 | 1588 


| [3505 


1 ^ 1 


|579 | 113 


|5.4e-05 | 


Protein name 






Locus Name 


Acc# 


hypothetical protein






2 pir:T14651 


T14651 



Description 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Probability 


|23847257_c2_167 


1589 


3509 


125 


P 78 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability


243i6886_t3_iI5 


| (1590 


| pno 


i i ui i 


|456 | |356 


1.7e-32 


Protein name 








Locus Name 


ACC# 










sp:YDCQ_ECOLI


| P76107 


Description 












HYPOTHETICAL 16.1 KD PROTEIN IN TEHB-ANSP INTERGENIC REGION




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

t — ■ i Score 
Length 


Probability 


24415876_±2_4B 


-J to. 1 


1 P 511 


1 l ib4 1 


|465 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

t — , i Score 
Length 


Probability 


|24417540_c2_187 


1592 


3512 


| |209 


|630 | |600 | 


p.3e-58 | 


Protein name 








Locus Name 


Acc# 










gp:XCRPFB 


"J Y09700 


Description 












X.campestris rpfB gene.


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|24431265_c2_182 


| |1593 


1 P 513 


1 1 


J1461 | 1261 | 


|2.1e-128 | 


Protein name 








Locus Name 


ACC# 










sp:SYC_ECOLI 


| P21888 



Description 



(CYSRS)






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


JW61443I_c2_I73 


| 1594 


|3514 


|1S9 


|510 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


2463i552_ c 2_I8i 


| |1595 


| pus 


pi 


|81S | |723 | 


|2.1e 


-71 


Protein name 










Locus Name 




Acc# 


thiamin biosynthesis protein thiG




pir :B70487 


B70487 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


24882676_c2_163 


1 1 1596 


|3516 


198 


|597 | |261 | 


1.9e- 


-22 | 


Protein name 










Locus Name 




Acc# 












sp:YE18_HAEIN 




P44189 


Description 
















HYPOTHETICAL PROTEIN HI1418


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


25397700_cl_148 


| 1597 


3517 


221 


| pes | 


|6.7e- 


-36 


Protein name 










Locus Name 




ACC# 


minor tail protein gp20






pir:T13106 




T13106 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


. — , , Score 
Length 


Probability 


25493762_cl_161 


| |1598 


| |3518 


r i 


p. | 






Protein name 










Locus Name 




Acc# 



Description 
NO-HIT






ORF Name 


NT ID 


AAID 


NT 
Length 


„ — . , Score 
Length 


Probability 


25584627_c3_216 


1599 


3519 


60 


183 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


258I542_c3_213 


| JI600 


| |3520 


i r i 


pu | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


26819002_ci_141 


| 1601 


3521 


90 | 


|273 | |72 | 


|0.020 


Protein name 








Locus Name 


Acc# 


hypothetical protein yorB






pir:T12887




Description 










T12887:C69922


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


276927_r2_58 


1602 


| 3522 


330 | 


| [in | 


p.OOlfi 



Protein name 



Description 



Locus Name 



sp:FINQ_ECOLI 



ACC# 
P18809 



FINQ PROTEIN


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


2792176_c2_180 


| 1603 


1 3523 


112 


339 112 


1.2e-06 


Protein name 








Locus Name 


Acc# 



sp : YRKP_BACSU 



P54433 



Description 

HYPOTHETICAL 20.7 KD PROTEIN IN BLTR-SPOIIIC INTERGENIC REGION






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


|29337908J:1JJ7 


| 1604 


J3524 


32 | 


p4S | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


|31678827_c3_222 


| 1605 


i 


1 P 44 1 


[735 | |527 


1.3e-b0 | 



Protein name 



Locus Name 



long-chain-fatty-acid-CoA ligase



gp:AF150669



Acc# 
AF150669 



Description 



Pseudomonas putida long-chain-fatty-acid-CoA ligase (fadD) gene, complete cds.


ORF Name NTID AAID Jj^ 


Score 


Probability 


|320S2552_t3_102 |1606 | 35:26 61 | |186 


I 54 


| (0.0065 



Protein name 

Description 
HYPOTHETICAL PROTEIN MJ0683



Locus Name 



sp:Y683_METJA 



Acc# 
Q58096 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

r — . i Score 
Length 


Probability 


|320775i_t3JL03 


1607 


| |3527 




I 381 1 




Protein name 








Locus Name 


ACC# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


35187543_c3_217 


1608 


| 3528 


|378 


1137 |105 | 


|0.0058 | 



Protein name 



Locus Name 



AdcB protein 



gp: SPADCA 



Acc# 
Z71552 



Description 
Streptococcus pneumoniae adcRCBA operon.






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


O ~) / O J. «J v»-X XJO 


" 1609 | 


3529 


233 


702 368 


8 . 9e 


-34 j 


Protein name 










Locus Name 




Acc# 












sp:CYC4_PSEST


Q52369 


Description 
















CYTOCHROME C4 PRECURSOR
















ORF Name 


NT ID 


AAID 


NT 
Length 


. — . , Score 
Length 


Probability


|402217_c3_194 


|1610 


3530 


1 161 1 


I486 1 






Protein name 










Locus Name 




Acc# 


Description 


















NO-HIT




ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


4069212_c3_19b 


| r xx 


|3531 


118 | 


357 |83 | 


|0.017 | 


Protein name 










Locus Name 




Acc# 












sp:Y182_METJA 


Q57641 


Description 
















HYPOTHETICAL PROTEIN MJ0182














ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|4331b63_c2_i72 




3532 


1179 


|3540 | 181 | 


1.2e 


-09 


Protein name 










Locus Name 




Acc# 


unknown




gp:AF011378 


AF011378 


Description 














Bacteriophage sk1 complete genome.














ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

T — . i Score 
Length - ■ 


Probability 


|44IS938_c2_i77 


|1S13 


3533 


1627 


14884 11863 1 


|3.4e 


-198 


Protein name 










Locus Name 




ACC# 


tail tip fiber protein gp21




pir:T13107 


T13107 



Description 






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


|4861263_c2_169 


1514 


3534 


121 | 


I 3 " 1 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|4867819_c2_152 


| |1615 


| |353b 


i r i 


|59i | |404 | 


1.4e 


-37 


Protein name 










Locus Name 




Acc# 


hypothetical protein HP1334




pir :F64686 


F64686 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|5130075_c2_186 


| |1S15 


| 353S 


| 431 


\1296 |858 | 


l.le- 


1 


Protein name 










Locus Name 




Acc# 



sp:DFP_HAEIN



P44953 



Description 

DNA/PANTOTHENATE METABOLISM FLAVOPROTEIN HOMOLOG



ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


553437_ri_28 


1517 


3537 


91 


276 






Protein name 








Locus Name 


Acc# 


Description 














NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|6375032_c2_175 


| |1518 


1 P 538 


1 F 2 1 


|849 


poi | 


|l.le-33 



Protein name 



Locus Name 



minor tail protein gp19



pir:T13105



Description 



Acc# 
T13105 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability


|682777_cl_145 


1519 


3539 


135 


|408 | 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|583187_cl_135 


| |1620 


1 p*«„ 


i 71 i 


P 


16 1 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|5925452_r3_100 


| |152i 


i p 541 


i p i 


P 


07 | |69 | 


10.042 


Protein name 










Locus Name 




Acc# 


hypothetical protein APE0740






pir :E72664 


E72664 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|790807_tl_15 


| |1522 




r i 


|305 






Protein name 










Locus Name 




ACC# 


Description 
















NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — , Score 
Length 


Probability 


830300_tl_21 


1523 


3543 


55 


198 






Protein name 










Locus Name 




Acc# 



Description 
NO-HIT






ORF Name 



NTID AAID 



1665782 iSS 



NT AA 
Length Length 
75T3 1 12253 



Score 



T7T 



Probability 
6.3e-12 



Protein name 



Locus Name 



gp:AB030825



Acc# 
AB030825 



Description 

Pseudomonas aeruginosa genomic DNA, partial sequence, strain: PAO1.



ORF Name 



114175056 tl 2 



Protein name 



Description 



NTID AAID 



NT AA 
Length Length 

1 F 



— . , Score 



12 04 



TIT" 



Probability 
4.5e-07 



Locus Name 



gp:ABCARRA



Acc# 
X70360 



A.brasilense carR gene.












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|23831527_c3_33 


1626 


| 


674 


2025 615 


i.7e-74 | 


Protein name 








Locus Name 


Acc# 


protein-disulfide reductase






gp:AF010322 


AF010322 



Description 



Pseudomonas aeruginosa protein-disulfide reductase (dipZ) and catabolic
dehydroquinase (aroQ) genes, complete cds.



ORF Name 



126276961 c2 28 



NTID 
] [1627 



AAID 



[T5TT 



NT AA 
Length Length 
1 [T2T5" 



Score Probability 
[1607 | |4.5e-165 



Protein name 



Locus Name 



chloroacetaldehyde dehydrogenase



gp:AF029733



Acc# 
AF029733 



Description 



Xanthobacter autotrophicus linear plasmid pXAU1
chloroacetaldehyde dehydrogenase (aldA) gene, complete cds.




NT AA 

ORF Name NTID AAID , — . , , — 

Length Length 


Score 


Probability 


|33581285_c2_24 |1628 |3548 | |512 | |1539 


i 1211 1 


4.ie-123 | 



Protein name 



Locus Name 



sp:Y736_HAEIN 



ACC# 

P44849 



Description 

HYPOTHETICAL SODIUM-DEPENDENT TRANSPORTER HI0736 






ORF Name NTID AAID 


NT 
Length 


AA 

, A , Score 
Length 


Probability


5312692J:3JL5 1629 | 3549 


1 1 411 1 


Izj b | 1U / d \ 


ii . le- lUo 


Protein name 




Locus Name 


Acc# 


sodium/proton-dependent alanine carrier protein homolog yrbD


pir:C69972


C69972














Description 








ORF Name NTID AAID 


NT 
Length 


AA 

x , Score 
Length 


Probability


|6152307_c2_26 | |1630 | |3550 


1 P«'' 1 


11154 1 11087 1 
1 III 


|5.7e-ii0 | 

1 1 


Protein name 




Locus Name 


Acc# 






sp:CYDB_ECOLI 


P11027 


Description 








BD-I OXIDASE SUBUNIT II)


ORF Name NTID AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


|781461_c2_25 | | p551 


1 P u 1 


1443 | |1563 | 


|2.1e-160 | 


Protein name 




Locus Name 


ACC# 






sp:CYDA_AZOVI 


Q09049 


Description 








CYTOCHROME D UBIQUINOL OXIDASE SUBUNIT I, 


ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


125143_cl_35 | |1632 | |3552 


1 1 


P 49 | |1J7 | 


|5.7e-09 | 


Protein name 




Locus Name 


Acc# 










probable enoyl-CoA hydratase




pir:G75557


G75557 


Description 








ORF Name NTID AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|126322Sb_c3_48 | |163!i 


|239 


|720 | |136 | 


|8.4e-06 | 


Protein name 




Locus Name 


Acc# 


probable erythrocyte-binding protein MAEBL


pir :T09127 


T09127 



Description 






ORF Name 



NT ID AAID 



AA 

— — , Score 
Length Length 



1 1306442b tl 6 



NT 
n 
T5TT 



TOT 



Protein name 



Locus Name 



Probability 
|4.4e-55 

Acc# 



sp:HEM6_ECOLI



P36553 



Description 

(COPROPORPHYRINOGENASE) (COPROGEN OXIDASE)



ORF Name 

Protein name 

Description 
CYTOCHROME C



NT AA 
NT ID AAID — ^ _ — Score 
Length Length 



] I 1 " 5 | I 3555 | | 158 1 | 477 | | 

Locus Name 



Probability 
2.3e-12 



sp:CYCP_ALCSP 



Acc# 
P00138 



ORF Name 



1 1952 77 tl 11 



NT ID 
] [1636 



AAID 



][ 



3555" 



NT 
n 



AA 

— _ Score 
Length Length 



T2HT 



Probability 
|B.2e-47 



Protein name 



Locus Name 



ORF396 protein



bp:&SDMaC 



Acc# 
Z73914 



Description 
Pseudomonas stutzeri orf396 gene.



ORF Name NTID AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


[i97137_c2_4<5 | |1637 | |3bbV 


1 I 710 1 


|2133 | |965 | 


|2.5e-i56 


Protein name 




Locus Name 


Acc# 






sp:DXS_HAEIN


P45205


Description 








1-DEOXYXYLULOSE-5-PHOSPHATE SYNTHASE (DXP SYNTHASE)


ORF Name NTID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|22697263_ci_37 | |1538 | |3558 


1 P 4 1 


315 |87 | 


|0 . 0022 


Protein name 




Locus Name 


Acc# 


probable enoyl-CoA hydratase




pir:E70868


E70868



Description 






ORF Name 



NT ID AAID 



124323500 ti 5 



TZTT 



NT AA 
Length Length 
T7T — 



Score 



Probability 
7.0e-41 



Protein name 

Description 
(COPROPORPHYRINOGENASE) (COPROGEN OXIDASE)



Locus Name 



sp:HEM6_ECOLI



Acc# 
P36553 



ORF Name 



130120325 ci 32 



Protein name 



Description 



NT ID 



AAID 



T5W 



EEZH! 



NT AA 
— — Score 

Length Length 

[TIT 



Tin- 



Locus Name 



Probability 



Acc# 



NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


33449042_C3_54 


|1S41 


|3551 


| 126 


P 7I | 


294 





Protein name 



Locus Name 



SuhB 



Description 



gp:AF010139



Acc# 
AF010139 



Azotobacter vinelandii iron-sulfur cluster assembly gene cluster, suhB,
cysE2, iscS, iscU, iscA, hscB, hscA and fdx genes complete cds; ndk gene,
partial cds.



ORF Name 
|33986312_r2_i6 
Protein name 

Description 



NTID 



AAID 



NT AA 
Length Length 
2773 — 



— . , Score 



TIT 



Probability 
|4.3e-50 



Locus Name 



sp:GCH2_HAEIN 



Acc# 
P44571 



GTP CYCLOHYDROLASE II,


ORF Name NTID AAID , — . , 

Length 


AA 

. — . , Score 
Length 


Probability 


|3544108<5_c2_43 1643 | |3563 | |149 | 


|4b0 | |% | 


|0.011 | 


Protein name 


Locus Name 


Acc# 


cell wall-binding protein homolog yvcE


pir:F70031


F70031



Description 






ORF Name 



NTID 



AAID 



5859703 cl 33 



NT AA 
Length Length 
T&L 1 11395 



Score 



7u"5~ 



Probability 
I.7e-69 



Protein name 



Description 



Locus Name 



ECOFOLC 



Acc# 
J02808 



E.coli folC gene encoding folylpolyglutamate-dihydrofolate synthetase, and a
protein required for its expression, complete cds.



ORF Name 



NTID AAID 



11045926 ci 177 



T£¥5~ 



[J5F5" 



NT AA 
Length Length 
TT7 1 



Score 



F5T" 



3T5T 



Probability 
|3.2e-29 



Protein name 



Locus Name 



yrp protein: multiple regulator protein 



pir :S70842 



ACC# 
S70842 



Description 
ORF Name 



NTID 



AAID 



NT AA 
Length Length Score



Probability 



105fl93ii_c3_274 | [1645 | [3555 | |401 | [1206 | |3.3e-153 
Protein name Locus Name Acc# 



ribonucleoside-diphosphate reductase, beta chain



pir:C64135 



C64135 



Description 
ORF Name 



NTID 



AAID 



110602250 t2 9b 



TST7" 



][ 



3567 



NT 
n 
Ttt 



AA 

— Score 
Length Length 



[25T~ 



Protein name 



Locus Name 



Probability 
|2.2e-21 

ACC# 



aluminum tolerance protein 



pir :PC4440 



Description 



PC4440:PC4514



ORF Name 



10751006 ci 182 



Protein name 



NTID 



Description 
A.brasilense carR gene. 



AAID 



] i 



NT AA 
Length Length 
T53 1 W&l — 



Score Probability 
|125 | |b.0e-08 



Locus Name 



gp : 



ACC# 
X70360 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


11912951_i:i_20 


|1649 


3569 


107 


324 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length


AA 

, — . , Score 
Length 


Probability 


|12972I6_c3__2B9 


| |1550 


ip 5, ° i 


r i 


|558 | |125 | 


|b.0e-08 


Protein name 








Locus Name 


Acc# 


colicin V production protein homolog




pir:E70195


E70195 



Description 



ORF Name 



114275330 Tl 68 



ii 



NT ID 
1651 



AAID 



13571 



NT AA 
Length Length 



Score Probability 
| [1470 | |377 | |9.9e-35 ~ 



Protein name 



Description 



Locus Name 



sp:Y4WB_RHISN 



Acc# 
P55680 



HYPOTHETICAL ZINC PROTEASE-LIKE PROTEIN Y4WB




i 


ORF Name NT ID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|14508500_c2_247 1652 3572 


1 1 


|1542 | |154S | 


|i.3e-I58 


Protein name 




Locus Name 


ACC# 


amidophosphoribosyltransferase,




pir :XQEC 




Description 






1 F65003:A92 

366 :A92367 



:S01389:I5 



ORF Name 



NT ID AAID 



14900187 t3 134 



TF5T" 



NT AA 
Length Length 

rrxs — 



Score 



ED 



Probability 
4.9e-33 



Protein name 



Locus Name 



probable 2-hydroxyhepta-2,4-diene-1,7-dioate isomerase b1180



pir:A64864



Acc# 
A64864 



Description 






ORF Name 



NT ID 



AAID 



TF5T" 



T57T" 



NT AA 
Length Length 

rm — 



TIT 



Score 



nrrr 



Probability 
12 .5e-27 



Protein name 



Locus Name 



RpsA 



gp:AF035937



Acc# 
AF035937 



Description 



Pseudomonas aeruginosa strain IATS 06 RpsA (rpsA) gene, partial cds;
Ihf-Beta, Wzz (wzz), and Wzx (wzx) genes, complete cds; and wbp gene cluster
for O-antigen biosynthesis, complete sequence.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

Length Score Probability 


16i94442_t3_135 


|I655 


3575 


| 453 


1377 | |1285| |6.0e-131 | 


Protein name 








Locus Name Acc# 










sp : PUR2_SALTY P26977 


Description 










RIBONUCLEOTIDE SYNTHETASE) (PHOSPHORIBOSYLGLYCINAMIDE SYNTHETASE)


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

Length Score Probability 


1682B790J:1_43 


1555 


3575 


292 


879 398 5.9e-37 



Protein name 

Description 
HYPOTHETICAL PROTEIN HI0432 



Locus Name 



sp:YJAD_HAEIN 



Acc# 
P44710 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


19S98381_ci_I89 


|I657 


3577 


1 fe0 1 


I 183 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


I1972931_t2_63 


1558 


3578 


p i 


pov | |b7 | 


J0.023 


Protein name 








Locus Name 


Acc# 



Description 
Rattus norvegicus unknown mRNA. 






ORF Name 


NTID 


AAID 


NT 
Length 


Score 

Length 


Probability


20601558_cl_163 




3579 


273 


1822 1 1737 1 
1 III 


17 . oe-73 
1 


Protein name 








Locus Name 


ACC# 










sp:YGHU_ECOLI 


Q46845 


Description 












HYPOTHETICAL 34.2 KD PROTEIN IN GSP-HYBG INTERGENIC REGION


i 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|2111556_cl_164 


1660 


| 3580 | 


1383 | 


|4152 |161 | 


|1.3e-29 | 



Protein name 

Description 
EXODEOXYRIBONUCLEASE V GAMMA CHAIN,



Locus Name 



sp:EX5C_HAEIN 



Acc# 
P44945 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|21642556_c3_272 


| 1661 


3581 


87 


264 "J 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length - - 


Probability 


|22751387_c3_271 


| (1662 


| |3582 


1 1 


|UI | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|23611527_c3_275 


| |1663 


| 3583 


114 | 


|345 | |140 '| 


|1.3e-09 | 


Protein name 








Locus Name 


Acc# 



sp:YFAE_HAEIN



P45154 



Description 
HYPOTHETICAL PROTEIN HI1309 






ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


23676035_c3_262 1664 3584 


410 


|1233 |177 | 


|1.3e-10 


Protein name 




Locus Name 


Acc# 


YtiP 




gp:AF008220


AF008220


Description 


Bacillus subtilis rrnB-dnaB genomic region.






ORF Name NTID AAID 


NT 
Length 


AA 

, — , , Score 
Length 


Probability 


|23725387_c2_244 1665 | 3585 | 


318 | 


|957 | [1046 | 


|1.3e-105 


Protein name 




Locus Name 


ACC# 






sp : FTSY_HAEIN 


P44870 


Description 








CELL DIVISION PROTEIN FTSY 


ORF Name NTID AAID 


NT 
Length 


AA 

„ — , , Score 
Length 


Probability 


|23728465_cl_161 | 1666 | 3586 


|925 


2778 2856 | 


2 . Oe-297 


Protein name 




Locus Name 


ACC# 


pyruvate dehydrogenase (lipoamide)




gp:AZPDHE


Y15124


Description 


Azotobacter vinelandii pdhE gene.


ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|23989512_c3_268 1667 | 3587 | 


393 


1182 [1066 | 


|9.6e-108 


Protein name 




Locus Name 


Acc# 






sp : PHEA_PSEST 


P27603 


Description 








ORF Name NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


24303127_ci_17i 1668 | |3588 | 


|407 


|1224 | |797 | 


|3.ie-79 


Protein name 




Locus Name 


Acc# 


carboxynorspermidine decarboxylase




gp:VIBCAMSDC 


D31783 



Description 



Vibrio alginolyticus nspC gene for carboxynorspermidine decarboxylase
(CANSDC), complete cds.






ORF Name 



|244077b0 c!4 2b3 



NTID 



AAID 
1 15589 



NT AA 
Length Length 

fizz — 



— . . Score 



Probability 



Protein name 



Description 



| 7 4 7 | |503 | | l.le-b8 ~ 

Locus Name Acc# 



|sp:DCOP_HAEIN 



P43812 



DECARBOXYLASE) 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|24642711_C2_216 


| (1570 


||3b90 | 


|773 


|2322 | |950 | 


|1.6e 


-96 | 


Protein name 








Locus Name 




Acc# 










sp:AROA_BACSU 




P20691 


Description 














(5-ENOLPYRUVYLSHIKIMATE-3-PHOSPHATE SYNTHASE) (EPSP SYNTHASE)




i 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


|25025987_c2_232 


| 1671 


II 3591 1 


|207 


524 484 


|4.5e- 


i 


Protein name 








Locus Name 




Acc# 



Description 



sp:YRBH_ECOLI 



P45395 



HYPOTHETICAL 35.2 KD PROTEIN IN MURA-RPON INTERGENIC REGION (O328)


ORF Name 


NT AA 

NTID AAID . . , . . . Score 

Length Length 


Probability 


|2543i62!>_c3_25i 


| |1S72 | |3592 | |105 | |321 | |224 | 


|1.6e-ifl 



Protein name 



Description 



Locus Name 



sp : IHFB_ERWCH 



Acc# 
P37983 



INTEGRATION HOST FACTOR BETA-SUBUNIT (IHF-BETA)






ORF Name 


NT AA 

NTID AAID „ — . , _ 

Length Length 


Score 


Probability 


|25445452_ci_144 


| 1573 | 3593 |215 548 


pi3 | 


|S.0e-28 



Protein name 



Locus Name 



conserved hypothetical protein



pir :F75285 



ACC# 
F75285 



Description 






ORF Name 



125564402 c3 285 



Protein name 



NT ID 



AAID 



NT 
|n 
[775" 



AA 

— Score 
Length Length 



2220 



\5T 



Probability 
|9.2e-06 



Locus Name 



hypothetical protein SCI7.24C



pir :T36920 



Acc# 
"J T36920 



Description 

ORF Name 
125359451 c2 249 



NT ID 

I F 7 *" 



AAID 



][ 



T5U5~ 



NT 
lie 

7TT 



AA 

, , _ — ^. Score 
Length Length 



Protein name 



Description 



[2142 [ [2272 | 

Locus Name 



Probability 
|i.5e-235 

Acc# 



sp:UVRB_PSEAE



P72174:P72147



EXCINUCLEASE ABC SUBUNIT B








ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|26750090_c2_24b 


| 1676 


| 3596 


| 347 


[1044 | |880 | 


|4.9e-88 


Protein name 








Locus Name 


Acc# 



Description 



sp:PYRD_SALTY 



P25468 



(DHODEHASE)


ORF Name NTID AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


|2923562_c2_233 1677 |3597 | 


177 | 


b34 352 | 


4.4e 


-32 


Protein name 






Locus Name 




Acc# 








sp:YRBI_ECOLI 


P45396:P45398


Description 










HYPOTHETICAL 20.0 KD PROTEIN IN MURA-RPON INTERGENIC REGION






ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length - 


Probability 


|29336052_ti_4i | 1678 3598 | 


r | 


1425 1440 | 


|9.1e 


-46 


Protein name 






Locus Name 




Acc# 


abc1 protein homolog TibBifo.i4




pir :T02007 


T02007 



Description 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Probability 


|30173201_r2_94 


| 1579 


3599 


66 


pu | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


30469092_cl_151 


| |1680 


| |3600 


i p 50 


i p" 1 1 1 " 


|6.4e-09 j 


Protein name 








Locus Name 


Acc# 


unknown 








|gp:MLCL622 


I Z95398 


Description 


Mycobacterium leprae cosmid L622 . 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


30600453_r3_139 


| |168I 


| |360i 


i r v 


| 2094 |777 | 


|2.0e-86 | 


Protein name 








Locus Name 


Acc# 


hypothetical protein b2324 






pir:B65005 


B65005 


Description 












ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


30720027_t3_141 


1682 


3602 


| 154 


465 382 


|2.9e-3b | 


Protein name 








Locus Name 


ACC# 


hypothetical protein 






gp:PPPAL1


X74218


Description 


Pseudomonas putida ruvB, tolQ, tolR, tolA, tolB and oprL genes.


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


31800280_cl_158 


| |1683 


13603 


| iOb 


|918 | |649 | 


i.be-63 | 


Protein name 








Locus Name 


Acc# 


hypothetical protein 






gp:PFFC2


Y11998



Description 



P. fluorescens FC2.1, FC2.2, FC2.3c, FC2.4 and FC2.5c open reading frames.






ORF Name 



NTID 



AAID 



131828211 t2 69 



NT AA 

— , — , Score 

Length Length 

Tim 1 13627 



Protein name 



Locus Name 



proline dehydrogenase



ATU39263 



Probability 
|0.0 

Acc# 
I U39263 



Description 



Agrobacterium tumefaciens plasmid pAtRio proline dehydrogenase (putA) and Prp
(prp) genes, complete cds.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

t — , i Score 
Length 


Probability 


33229667_c3_270 


|1685 


| |3605 


72 


219 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|33985930_tl_23 


| 1686 


3606 


288 


I 867 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


34062503_cl_178 


| 1687 


| 3607 


181 | 


|546 | |224 | 


1.6e-18 


Protein name 








Locus Name 


Acc# 



Description 



sp:YHBN_HAEIN 



P45074 



HYPOTHETICAL PROTEIN HI1149 PRECURSOR 


NT 

ORF Name NTID AAID . — ^ 

Length 


AA 

t — , i Score 
Length 


Probability 


|34172883_cl_176 | |1688 | |3608 | 166 | 


|50i |296 | 


|3.8e-26 


Protein name 


Locus Name 


Acc# 



sp:YJEE_HAEIN



P44492 



Description 
HYPOTHETICAL PROTEIN HI0065 PRECURSOR 






ORF Name 



NT ID AAID 



NT AA 

— _ „ — _ Score Probability 
Length Length 



34409658 t2 84 



TFF9" 



TOT 



1083 



[71T7~ 



|3.3e-75 



Protein name 



Locus Name 



carboxyl esterase



pir:S57530



Acc# 
S57530 



Description 

ORF Name 
|35157165_cl_19i 



NTID 



AAID 



] 



][ 



T5TTT 



NT AA 
Length Length 



Score Probability 
2.7e-29 



Protein name 



Locus Name 



methylated-DNA--protein-cysteine S-methyltransferase,



pir:D64604 



Acc# 
D64604 



Description 
ORF Name 



NTID AAID 



13607213b c3 29b 



TF9T" 



3511 



NT 



AA 

— , — , Score 
Length Length 



Probability 
i.5e-54 



Protein name 



Locus Name 



Acc# 



sp:EX5A_ECOLI



Description 










P04993:Q59378


ALPHA CHAIN) 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


36112900_t2_91 


1692 


3612 


103 


312 253 


i.4e-2i 


Protein name 








Locus Name 


Acc# 



Description 



gp:ECU24202 



U24202 



Escherichia coli ecok bo (yciD) gene, partial cds, and (yciC), (yciB),
(yciA), membrane protein (tonB), (yciI), putative potassium channel (kch), and
cardiolipin synthase (cls) genes, complete cds.



ORF Name 



NTID 



AAID 



136129676 c3 252 



^TI- 



NT 
n 

TJ5 



AA 

— Score 
Length Length 



Probability 



Protein name 



|420 | [85 1 [0,00086 

Locus Name 



hypothetical protein yrvD



pir:G69980



ACC# 
G69980 



Description 






ORF Name 



NT ID AAID 



NT AA 

— — _ Score Probabilit y 
Length Length £ - 



3915943 c2 226 



1245 



TTST 



B.Oe-118 



Protein name 



Locus Name 



sp:METZ_PSEAE 



Acc# 
P55218 



Description 

O-SUCCINYLHOMOSERINE SULFHYDRYLASE, (OSH SULFHYDRYLASE)



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 




1695 


| 3615 


1 1 381 i 


|1146 | 


708 | 


|8.3e-70 | 


Protein name










Locus Name 


Acc#


pi UJJaJJic pvuo pxuL> 


em 








pir:B70591


B70591 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


13933437 c2 229 
1 "~ ~ 


| |1696 


| |3616 


1 I 202 1 

J 1 I 


I 6 


09 | 


387 


8.6e-36 1 
1 


Protein name 










Locus Name 


Acc# 


hypothetical protein jhp0867




pir:B71879




Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


4016943_i2_67 


1697 


3617 


473 


1422 


653 


5.6e-64 


Protein name 










Locus Name 


Acc# 












sp:Y4WA_RHISN


P55679 


Description 
















HYPOTHETICAL ZINC PROTEASE Y4WA,












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


4103293_ci_179 


1598 


| J3618 


ip 45 


|738 | |822 


6.9e-82 



Protein name 



Locus Name 



putative ABC transporter ATP-binding protein



gp:AF013987 



Acc# 
AF013987 



Description 



Vibrio cholerae strain O395 putative ABC transporter ATP-binding protein,
sigma54 (rpoN), putative sigma54 modulation protein and nitrogen regulatory
IIA protein (ptsN) genes, complete cds.






ORF Name 


NT ID 


AAID 


NT 
Length 


. — , Score 
Length 


Probability 


|4114702_cl_159 


1599 


| 3619 


119 


|350 | 194 


2 .4e-15 


Protein name 








Locus Name 


Acc# 


probable dihydroneopterin aldolase




pir:H65093


H65093 


Description 












ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|4489453_r2_90 


|1700 


| |3520 


1 1 425 i 


|1278 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


4689693_c3_278 


| |I70I 


| (3621 


1 yn 1 


J1119 | |509 | 


8.2e-55 | 



Protein name 



Description 



Locus Name 



sp:MIAA_HAEIN



Acc# 
P44495 



(IPP TRANSFERASE)


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . . Score 
Length 


Probability 


|4772050_±2_85 


1702 


| 3522 


441 


1325 487 


|2.2e-46 


Protein name 








Locus Name 


ACC# 



Description 



sp:DP3E_HAEIN 



P43745 



DNA POLYMERASE III, EPSILON CHAIN,






i 


ORF Name 


NTID AAID 


NT 
Length 


AA 

, — , , Score 
Length 


Probability 


|4815513_c3_294 


|1703 | |3523 


1318 


|3957 |230 | 


3.1e-41 


Protein name 






Locus Name 


Acc# 



Description 
BETA CHAIN) 



sp:EX5B_ECOLI



P08394 






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


4863458_c2_234 


|1704 


| 3624 


172 


| 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|4875525_c3_293 


| |1705 


| |3625 


i r i 


r° i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|4878135_c2_250 


| J1705 


| |362S 


350 1 


|1053 | |8i0 | 


|i.3e-80 



Protein name 



Locus Name 



yhdG homolog



AF040378 



Acc# 
AF040378 



Description 



Serratia marcescens ribosomal protein L11 methyltransferase (prmA) gene,
partial cds; and yhdG homolog and small DNA binding protein Fis (fis) genes,
complete cds.



ORF Name 



4881700 c3 290 



NTID 
"J [1707 



AAID 



TST7" 



NT 

.n 



AA 

— Score 
Length Length 



1440 



T8T" 



Probability 
3.7e-35 



Protein name 



Locus Name 



hypothetical protein 5 



pir:T00101



ACC# 
T00101 



Description 

ORF Name 
|4884675_c3_283 



NTID 



AAID 



T7uTT 



NT 
n 
75? 



AA 

r — Score 
Length Length 



Probability 



1ZT 



|138 | | l.i>e-0 7 



Protein name 



Locus Name 



hypothetical protein



gp:AF031940 



Acc# 
AF031940 



Description 

Sinorhizobium meliloti alcohol dehydrogenase (adhA) gene, complete cds.






ORF Name 


NT ID 


AAID 


NT 
Length 


— Score 
Length 


Probability 


5086693_c3_277 


1709 


3629 


1 419 
1 


1260 |793 | 


|8T^e 


-79 


Protein name 








Locus Name 




Acc# 


hypothetical protein slr0049




pir:S74347




S74347 


Description 














ORF Name 


NT ID 


AAID 


NT 

Length 


AA 

_ . , Score 
Length 


Probability 


|5098937_t2_51 


1710 


| 3630 


i b4i i 


|1632 | |b^0 | 


[3 .le 


-53 


Protein name 








Locus Name 




Acc# 


probable exodeoxyribonuclease VII large subunit


pir:C75549


C75549


























Description 














ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ . , Score 
Length 


Probability 


5110963_cl_162 | 


1711 


| 3631 


559 


1680 | |1056 | 


|r71e 


-106 i 


Protein name 








Locus Name 




Acc# 










sp:ODP2_PSEAE




Q59638 


Description 














COMPLEX, (E2) 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|5112763J:2_a9 | 


|1712 


| |3632 


1 V" 1 


|834 | |364 | 


|2.4e 


1 


Protein name 








Locus Name 




ACC# 










sp:YDGM_HAEIN 




P71396 


Description 














PUTATIVE FERREDOXIN-LIKE PROTEIN HI1684








ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

— . , Score 
Length 


Probability 


5323750_r3_10U 


1713 


|3633 


104 | 


315 







Protein name Locus Name Acc# 

Description 



NO-HIT






ORF Name 



NTID AAID 



6484691 tl 26 



TTDT 



NT 
Length 
TT7 



AA 



Le ~ th Score Probability 



TT7T 



5.5e-73 



Protein name 



Locus Name 



sp:CYSP_ECOLI 



Acc# 
P16700 



Description 
THIOSULFATE-BINDING PROTEIN PRECURSOR 



ORF Name 



NTID AAID 



NT 
Length 



AA 

— , Score Probability 
Length 



1806512 c3 279 



] I 1715 | P<^b | |137 | 



Protein name 



|4i4 | |171 | |1.8e-12 ~ 
Locus Name Acc# 



polysialic acid capsule expression protein 



pir:B70434



B70434



Description 
ORF Name 



NTID AAID 



11562 c3 7 



T7TT 



NT 
Length 
75 



— , Score Probability 



AA 
Length 

pr 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 



ORF Name 



NTID AAID 



20395432 c2 6 



TTTT 



T5T7" 



NT 
Length 
777 



AA 
Length 



— — Score Probability 



Protein name 
Description 



Locus Name 



Acc# 



NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


36117135J:1_1 


| |1718 | |3638 


i i m i 


|1008 | 


pin | 


p.«e-136 j 



Protein name 



Locus Name 



malate dehydrogenase 



EE 



AF109682



Acc# 
AF109682 



Description 
Aquaspirillum arcticum malate dehydrogenase (MDH) gene, complete cds.






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


5582952 tl 2 


1719 


3639 


185 1 
1 1 


1258 1 
1 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 












ORF Name


NT ID


AAID


NT 
Length 


AA 

Score 

Length 


Probability


13958403_tl_l 


1 11720 
1 1 


1 13640 1 
1 1 1 


1399 1 
1 1 


11200 1 12511 
1 1 1 


2.4e-127 1 
1 


Protein name 








Locus Name 


Acc# 










sp:YLIG_ECOLI


P75802 


Description 












HYPOTHETICAL 49.6 KD PROTEIN IN MOEA-DACC INTERGENIC REGION




ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


15505250 t2 4 


| 1721 


||iUl | 


|140 


|423 | 236 | 


4.8e-19 


Protein name 








Locus Name 


ACC# 


unknown








gp:AF025544 


AF026544 


Description 












Ralstonia eutropha phbF and beta-ketothiolase (bktB) genes, complete cds; and
unknown genes.




ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|20782550_t3_16 


1 1722 




242 | 


|729 | 933 


|i.2e-93 


Protein name 








Locus Name 


ACC# 










sp:MTNG_NEIGO


P08455 


Description 












METHYLTRANSFERASE NGOPII) (M. NGOPII)






i 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


24328950_t2_8 


|1723 


|3643 


153 


|452 | 335 


2.2e-30 | 



Protein name 



Locus Name 



sp:YRFH_ECOLI



Acc# 
P45802 



Description 

HYPOTHETICAL 15.5 KD PROTEIN IN MRCA-PCKA INTERGENIC REGION (0133)






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 


Probability 








Length 




29859790_c2_32 


1724 


3644 | 


|74 | 


p 5 i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|3942592_t2_<5 


| |1725 


1 P 64b 1 
I 1 1 


1252 1 

1 1 


1759 1 1741 1 
l_ J L J 


|2.Se-73 


Protein name 








Locus Name 


Acc# 


hypothetical protein, 26K 






pir:JC5479


JC5479


Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|4i03390J:3_I5 


(1726 


IP" 4 1 


85 


|258 | |323 | 


|5.2e-29 | 


Protein name 








Locus Name 


ACC# 










sp:MTNG_NEIGO


P08455 


Description 












METHYLTRANSFERASE NGOPII) (M. NGOPII)








ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|42837_tl_2 


1727 


3547 


71 


|216 | |I70 | 


|2.5e-12 | 


Protein name 








Locus Name 


ACC# 










sp:MTNG_NEIGO


P08455


Description 












METHYLTRANSFERASE NGOPII) (M. NGOPII)









ORF Name 
|4975512_t27T 



NTID 



AAID 



][ 



IRS - 



NT 
Length 
FT3 



AA 
Length 

EffD 



Score 



Probability 
7.8e-129 



Protein name 



Locus Name 



threonine dehydratase, biosynthetic 



pir:E75502



Acc# 
E75502 



Description 






ORF Name 


NT ID AAID


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


7038307__il_3 


1729 3649 


127 


1384 1 1228 1 


|6 . le 


-19 


Protein name 








Locus Name 




Acc# 










sp:PA1F_HUMAN




Description 












P24666:Q16035:Q16725


(EC 3.1.3.48) (ADIPOCYTE ACID PHOSPHATASE, ISOZYME ALPHA)








ORF Name 


NT ID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


30175950_i:i_i 


| 1730 | 3550 | 


i" 


|234 | |292 | 


|5.8e 


1 


Protein name 








Locus Name 




Acc# 










sp:THIC_BACSU 




Description 












P45740:P71090


THIAMINE BIOSYNTHESIS PROTEIN THIC 




ORF Name 


NT ID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


4470181_t3_5 


| |1731 | (3551 | 




|471 | 535 


|4.5e 


-62 | 


Protein name 








Locus Name 




Acc# 










sp:THIC_ECOLI 


P30136 


Description 














THIAMINE BIOSYNTHESIS PROTEIN THIC 




ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


7119001_t2_4 


| |1732 | (3552 | 


P 


|222 | |85 | 


|0.013 


Protein name 








Locus Name 




ACC# 










sp:YA51_HAEIN 




Description 












Q57180:O05043


HYPOTHETICAL ABC TRANSPORTER ATP-BINDING PROTEIN HI1051








ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


24254702_t3_3 


| (1733 | |3S53 | 


|252 


|759 | 253 | 


|1.4e- 


-21 


Protein name 








Locus Name 




Acc# 










sp:YIAT_ECOLI 


P37681 


Description 














HYPOTHETICAL 27.4 KD PROTEIN IN AVTA-SELB INTERGENIC REGION PRECURSOR








ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


25787500_c3_6 


1734 


3654 


1 1278 
1 1 


P 7 1 


|503 | 


|4.4e-48 1 
1 1 


Protein name 








Locus 


Name 


Acc# 










sp:BFRA_NEIGO


P72080 


Description 














BACTERIOFERRITIN A


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|5133575_c3_7 


11735 


1 P 555 


1 1 


|489 | 


|489 | 


|1.3e-45 | 



Protein name 



Description 



Locus Name 



sp:BFRB_NEIGO



Acc# 
P77914 



BACTERIOFERRITIN B 


(BFR A) 


(BFR B) 








ORF Name 


NTID 


AAID 


NT 

Length 


AA 

. — . , Score 
Length 


Probability 


2i984375_ci_i0 


1736 


3555 


i 473 i 


1422 |70S | 


|8.3e-70 | 


Protein name 








Locus Name 


Acc# 



Description 



|sp:AIP2_YEAST 



P46681 



ACTIN INTERACTING PROTEIN 2








ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


23535910_tl_2 


1737 | 3557 


1 I 492 1 


1479 | |1489| 


|1.4e-152 | 


Protein name 






Locus Name 


Acc# 



sp:YEGQ_ECOLI



Description 
PUTATIVE PROTEASE YEGQ,



P76403:O08007:O08010



ORF Name 
125577175 t3 5 



NTID 



AAID 



NT 
Length 

] 



AA 

T — i-u Score 
Length 



ST5" 



Probability 
|515 | |1.8e-49 ~ 



Protein name 



Locus Name 



site-specific DNA-methyltransferase (cytosine-specific), HP1121



pir:A64660



Acc# 
A64660 



Description 






ORF Name 



NTID 



AAID 



3926252 c2 13 



1739 



NT 
n 
225 



AA 

— Score 
Length Length 



1575" 



T3T" 



Probability 
8.6e-08 



Protein name 



Locus Name 



TerZ



gp:AF168355



Acc# 
AF168355 



Description 



Proteus mirabilis tellurite resistance locus, complete sequence; and unknown gene.



ORF Name 


NTID 


_ _ NT AA 
AAID . — . , . — . , Score 
Length Length 


Probability 


3946943_tl_l 


|1740 


|3660 510 |1533 | |808 | 


|2.ie-80 | 


Protein name 




Locus Name 


Acc# 


1 OprM 




gp:AB011381


AB011381 


Description 


Pseudomonas aeruginosa gene for OprM, complete cds.




ORF Name 


NTID 


_ _ NT AA 
AAID . — . , . — . , Score 
Length Length 


Probability 


|2ii0557_c2_3 


1 I 1741 


| |366i | |22i | |6G3 | |570 | 


|3.5e-55 | 


Protein name 




Locus Name


Acc# 



Description 



sp:Y926_SYNY3



P72872 



HYPOTHETICAL 37.9 KD PROTEIN SLL0926








I 


ORF Name 


NTID AAID 


NT 
Length 


AA 

t — , i Score 
Length 


Probability 


|15040887_t3_ll 


1742 | 3662 | 


|504 


|1512 2557 


|9.6e 


-266 | 


Protein name 






Locus Name 




Acc# 


unknown 


gp:AF039312


AF039312 



Description 



Moraxella catarrhalis strain 4223 transferrin binding protein A (tbpA) and
transferrin binding protein B (tbpB) genes, complete cds; and unknown gene.



ORF Name 



4016563 ci 13 



NTID 
^ [1743 



AAID 



3553" 



NT 
n 
TOT 



AA 

— — Score 
Length Length 



] EO [ 



52" 



Probability 
] 10.00021 



Protein name 



Locus Name 



conserved hypothetical protein ykoJ 



pir:F69859



Acc# 
F69859 



Description 






ORF Name 



NT ID 



AAID 



4484567 1:1 1 



1744 



NT AA 
Length Length 
1 12700 



Score Probability 
14 555 | |0.0 



Protein name 



Locus Name 



transferrin binding protein A



gp:AF039312 



Acc# 
AF039312 



Description 



Moraxella catarrhalis strain 4223 transferrin binding protein A (tbpA) and
transferrin binding protein B (tbpB) genes, complete cds; and unknown gene.



ORF Name 



NT ID 



AAID 



14775207 t2 9 



TTTT 



NT AA 
Length Length 
T7T 



] i 



5T2~ 



Score Probability 
1728 | | i.6e- 7 i ~ 



Protein name 



Locus Name 



transferrin binding protein A



gp:AF039315 



Acc# 
AF039315 



Description 



Moraxella catarrhalis strain Q8 transferrin binding protein A (tbpA) and
transferrin binding protein B (tbpB) genes, complete cds; and unknown gene.



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


33380275_tl_2 | 


|174S 


| [3555 | 




I 1 


98 | 




Protein name 
Description 










Locus Name 


Acc# 


NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|35351043_cl_7 | 


|1747 


| 36G7 


P 1 


| |9J | 


|0. 00048 


Protein name 










Locus Name 


ACC# 


phosphate-binding protein, phosphate-repressible




pir:I64120 


I64120








Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


3650156I_c3_9 | 


1748 


3558 | 


301 


|903 |842 | 


b.2e-84 | 


Protein name 










Locus Name 


ACC# 












sp:PSTC_HAEIN


P45191 


Description 














PHOSPHATE TRANSPORT SYSTEM PERMEASE PROTEIN PSTC


i 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|5960433_ci_8 


1749 


| |3569 


60 | 


I 183 I 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 




NT ID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|4429510_ti_i | 


|1750 


| |3670 


1 1 


(1434 | |1328 | 


|1.7e-135 | 


Protein name 








Locus Name 


Acc# 










sp:MANB_SALMO


Q01411 


Description 












PHOSPHOMANNOMUTASE, (PMM)










ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

M , Score 
Length 


Probability


4459376_c2_lb | 


1751 


|3671 


294 1 
1 


1885 1 57b 
1 1 


i.oe-55 1 
1 


Protein name 








Locus Name 


Acc# 


conserved hypothetical protein




pir:D75311


D75311 


Description 












ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|10429517_cl_34 


1752 


3672 


413 


1242 573 


1.7e-55 


Protein name 








Locus Name 


Acc# 


conserved hypothetical protein




pir:A75525 


A75525 


Description 












ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|1238462b_c3_48 


|1753 


| pro 


362 | 


1089 | |386 | 


|l.le-35 | 


Protein name 








Locus Name 


ACC# 



sp:YGBO_ECOLI



Q57261 



Description 

HYPOTHETICAL 39.1 KD PROTEIN IN SURE-CYSC INTERGENIC REGION






ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


15915625J:2_13 


1754 


|3674 


i 1 ™ i 


ibii | 






Protein name 








Locus 


Name 


Acc# 


Description 














NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


21573425_t2_I2 




| |M7> 


i h 4v i 


|1344 | 


r i 


|1.7e-48 


Protein name 








Locus


Name 


Acc# 










sp:UBIH_ECOLI 


P25534 


Description 














UBIH PROTEIN,


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


219593I_ci_29 


| 1755 


| | 3S 7 S 


1 ^ 


F 10 1 


P 1 


(0.0018 | 



Protein name 



Locus Name 



conserved hypothetical protein aq_2107



pir:F70480



Acc# 
F70480 



Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — , Score 
Length 


Probability 


2235500i_t3J23 


1757 


3677 


73 






Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


2395403l3_c3_50 


| |1758 


3678 


,1175 


|528 | |ii9 | 


|3.8e-05 | 



Protein name 



Locus Name 



conserved hypothetical protein aq_2107



pir:F70480



Acc# 
F70480 



Description 






ORF Name 


NTID AAID 


NT 
Length 


. — . , Score 
Length 


Probability 


25665885 C2 36 


1759 3679 


241 


726 389 


6.3e-45 


Protein name 






Locus Name 


Acc# 








sp:MIAE_SALTY


Q08015 


Description 










TRNA- (MS L2 


ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|3oiooaao_c3_5i 


1760 3680 


pi | 


|606 | |186 | 


1.7e-14 | 


Protein name 






Locus Name 


Acc# 


hypothetical protein aq_2108




pir:G70480 


G70480 


Description 










ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|3994052_±1_3 


| 1761 | 3681 


| 192 


|579 | |682 | 


|4.7e-67 | 


Protein name 






Locus Name 


Acc# 


probable dCTP deaminase




pir:B71565 


B71565 


Description 










ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|4109790_cl_28 


1762 3682 


|155 


|468 | 176 


2.2e-12 


Protein name 






Locus Name 


Acc# 


conserved hypothetical protein aq_2107


pir:F70480


F70480 


Description 










ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


4319837_c2_37 


| |1763 | 3683 


1 453 1 


1482 | 466 


3.7e-44 | 


Protein name 






Locus Name 


Acc# 



sp:YJEF_ECOLI



P31806 



Description 

HYPOTHETICAL 54.7 KD PROTEIN IN PSD-AMIB INTERGENIC REGION (URP1)






ORF Name 



NT ID 



AAID 



NT AA 
— — Score 
Length Length 



Probability 



4345068 F3 21 



Protein name 



|128 | |387 | |177 | |1.5e-13 
Locus Name 



sp:YOHJ_ECOLI



Acc# 
P33372 



Description 

HYPOTHETICAL 14.5 KD PROTEIN IN PBPG-CDD INTERGENIC REGION



ORF Name 
|4790537_tr^7" 



NT ID 



AAID 



NT 
Length 
[151 



Score 



Protein name 



Description 



AA 
Length 

|552 | |295 | |4.8e-26 ~ 
Locus Name Acc# 



Probability 



sp:YOHK_HAEIN 



P45146 



HYPOTHETICAL PROTEIN HI1298






i 


ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|590u203_ti_i 


| [1755 | p686 


589 


|2070 | |1S09| 


|2.8e-i65 | 


Protein name 






Locus Name 


Acc# 



Description 



sp:REP_ECOLI 



P09980 



ATP-DEPENDENT DNA HELICASE REP,










ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


5658527_c2_3i) 


| 17S7 


(3687 


1 I 184 1 


I 555 1 


|228 | 


5.<5e-18 | 



Protein name 



Locus Name 



conserved hypothetical protein aq_2107



pir:F70480



Acc# 
F70480 



Description 

ORF Name 
|7226518__t2_17 



NTID AAID 
j |17<58 | |?^~ 



NT 
Length 




AA 
Length 
|30$ 



Score 



TIT 



Probability 
i.2e-06 



Protein name 



Locus Name 



hypothetical protein



|gp:POL010353 



Acc# 
AJ010393 



Description 



Pseudomonas oleovorans phaI and phaF genes, and ORF1, ORF2 (partial) and ORF3.







ORF Name 


NTID


AAID


NT 
Length 


AA 

. — . , Score 
Length 


Probability 




1759 


3689 


1 1 


|462 | |338 | 


1.3e 


-30 


Protein name 










Locus Name 




Acc# 












sp:FMAH_BACNO


P04953 


Description 
















SUBUNITS PILIN) 




ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length ™ 


Probability 


36210875_r2_3 


| |1770 


|3690 


i s8j i 


2652 |3272 | 


|0.0 


1 


Protein name 










Locus Name 




Acc# 












sp:ACO2_ECOLI




^ P36683 :P36 


Description 














648 :Q59382 
















:P75652 


(ACONITASE 2) 




ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


14853143_cl_9 


1771 


3691 


1 \ m 


|2112 |1460| 


1.7e 


-149 | 


Protein name 










Locus Name 




Acc# 












sp:YHOT_NEIME 




Q51152 


Description 
















HYPOTHETICAL 8!3 . 1 


KD PROTEIN IN REGION E 












ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


16050817__tl_l 


|1772 


| |3692 


1 F h 1 


|67B | |203 | 


|2.7e- 


1 


Protein name 










Locus Name 




Acc# 


hypothetical protein sll0788




pir:S77018


S77018 


Description 
















ORF Name 


NT ID 


AAID 


NT 
Length 


t — . ^ Score 
Length 


Probability 


10175877J:3_73 


| |1773 


1 | 3693 


264 | 


795 | 124 | 


2.2e- 


1 


Protein name 










Locus Name 




Acc# 


DnrD protein 




gp:PaT13171b 




AJ131715 



Description 



Pseudomonas stutzeri dnrD gene and ORF194 (partial) and ORF63 (partial).






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|1019b250__t2_49 


] |1774 


| p«94 


1 91 1 


P 46 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


, — . , Score 
Length 


Probability 


|105469!i0_tl_18 


| 1775 


|3595 


p 35 i 


|720 | |576 | 


|8. le-56 | 


Protein name 








Locus Name 


Acc# 



Description 



sp:MODB_HAEIN



P45322 



MOLYBDENUM TRANSPORT SYSTEM PERMEASE PROTEIN MODB




1 


ORF Name 


NT 

NTID AAID — 

Length 


AA 
Length 


Score 


Probability 


|11113i52_t3_ 


70 | (1775 | |3696 142 | 


|429 


| 205 | 


p.. 76-16 



Protein name 



Locus Name 



hypothetical protein APKi^yi



pir:D72603



Acc# 
D72603 



Description 
ORF Name 



NTID AAID 



12367711 ±3 63 



TTTT 



NT AA 
Length Length 
ESS — 



Score 



T5TT 



Probability 
3.4e-41 



Protein name 



Description 



Locus Name 



sp:MODD_AZOVI



ACC# 
P37732 



Protein name 



Locus Name 



putative chaperone



gp:PSNARXL 



Acc# 
Y15252 



Description 



Pseudomonas aeruginosa narX, narL, narK1, narK2, narG, narH, narJ, narI,
nifM, moaA genes.



MOLYBDENUM TRANSPORT ATP-BINDING PROTEIN MODD




1 


ORF Name 


NTID 


AAID 


NT AA 
Length Length 


Score 


Probability 


|157I0327_ci_ 


84 | [1778 


| 3598 


266 |801 | 


f « | 


b.0e-40 | 






ORF Name 



15781576 C2 103 



Protein name 



Description 



NT ID AAID 



TTTT 



NT 
n 
T21 



AA 

— Score 
Length Length 



|<=>72 | |594 | 



Locus Name 



sp:YADF_ECOLI



Probability 
i.0e-57 



Acc# 

P36857:P75656



HYPOTHETICAL 25.1 KD PROTEIN IN HPT-PAND INTERGENIC REGION




ORF Name 


NTID AAID 


NT AA 
Length Length 


Score 


Probability 


19735i88_t3_58 


| 1780 |3700 


577 |2034 


1 I 484 1 


|3.8e-70 



Protein name 



Locus Name 



nitrate/nitrite sensory protein 



gp:PSNARXL



Acc# 
Y15252 



Description 



Pseudomonas aeruginosa narX, narL, narK1, narK2, narG, narH, narJ, narI,
nifM, moaA genes.



ORF Name 


NT 

NTID AAID „ — . , 
Length 


AA 

_ — . , Score 
Length 


Probability 


|19806552_t2_31 


|1781 |370i | 187 | 


|b,4 | p4 | 


5.5e-08 | 


Protein name 




Locus Name 


Acc# 


Notch homolog




^ gp:AF033013 


AF033013 


Description 


Bombyx mori Notch homolog mRNA, partial cds.


ORF Name 


NT 

NTID AAID , . , 

Length 


AA 

T — . i Score 
Length 


Probability 


|19806552_r3_5i 


| 1782 3702 180 | 


|543 | 142 


i.3e-09 | 


Protein name 




Locus Name 


Acc# 


Notch homolog




gp:AF033013 


AF033013 


Description 


Bombyx mori Notch homolog mRNA, partial cds.


ORF Name 


NT 

NTID AAID . . , 

Length 


AA 

. — . , Score 
Length 


Probability 


20423500_t2_33 


| |1783 |3703 | |269 | 


|8i0 | |420 | 


|2.7e-39 | 


Protein name 




Locus Name 


Acc# 



sp:MOEB_SALTY



Q56067 



Description 
MOLYBDOPTERIN BIOSYNTHESIS MOEB PROTEIN






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|20587686_tl_8 


| |1754 


| |3704 


1 1 1 " i 


po | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


|20876387J:1_13 


| |1785 


| |3705 


i p j4 i 


|705 | |247 | 


5.9e-21 


Protein name 








Locus Name 


Acc# 



Description 



sp:YIIM_ECOLI 



P32157 



HYPOTHETICAL 25.5 KD PROTEIN IN KDGT-CPXA INTERGENIC REGION (O234)


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


21485952_c2_129 


| 1786 


|3706 | 


63 | 


l 1BS 1 






Protein name 








Locus 


Name 


Acc# 


Description 














NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|21673452_c3_14i 


| 1787 


|3707 


448 


|1347 | 


1455 


5.8e-149 



Protein name 



Locus Name 



nitrate extrusion protein 



gp:PSNARXL



Acc# 
Y15252 



Description 



Pseudomonas aeruginosa narX, narL, narK1, narK2, narG, narH, narJ, narI,
nifM, moaA genes.



ORF Name 



21688888 cl 75 



NTID AAID 

■] [1788 | prre- 



NT AA 
— — Score 
Length Length 



TTT 



Probability 
TUJT\ |1.8e-104 — 



Protein name 
Description 

THIAMINE BIOSYNTHESIS PROTEIN THII



Locus Name 



sp:THII_SALTY 



Acc# 

P55913:O06955






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


22000717_tl_JL9 


1789 




145 


438 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|22378418_tl_17 


| |1790 


| |3710 


1 1 


|852 | |502 | 


5.5e-48



Protein name 



Description 



Locus Name 



sp:MODA_HAEIN 



Acc# 
P45323 



MOLYBDATE-BINDING PERIPLASMIC PROTEIN PRECURSOR




ORF Name 


NT AA 
NTID AAID _ — . , . — . , Score 
Length Length


Probability 


2255403i_r3_60 


| |1791 | |3711 | |194 | |585 |499 | 


|1.2e-47 


Protein name 


Locus Name 


ACC# 



Description 



sp:MOAB_ECOLI 



P30746 



MOLYBDENUM COFACTOR BIOSYNTHESIS PROTEIN B






i 


ORF Name 




NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|24068812_tl_ 


12 | 


|1792 | |3712 


1 P M 1 


|777 | 


i" 1 1 


|3.1e-54 | 



Protein name 



Locus Name 



gp:PSNARXL



Acc# 
"I Y15252 



nitrate/nitrite regulatory protein
Description 

Pseudomonas aeruginosa narX, narL, narK1, narK2, narG, narH, narJ, narI,
nifM, moaA genes.



ORF Name 
|2442337BJ:TrB" 



NTID 



AAID 



NT AA 
— — Score 
Length Length 



][ 



T7TT 



EE3 



Protein name 
Description 

NO-HIT



Locus Name 



Probability 



Acc# 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


24423375_t2_32 


1794 


3714 


1 66 


poi | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|2465iS36J::L_I6 


| |1795 


|3715 


i r° i 


|603 | |310 | 


|1.2e-27 | 



Protein name 



Description 



Locus Name 



sp:Y903_SYNY3



Acc# 
Q55371 



HYPOTHETICAL 16.5 KD PROTEIN SLR0903












ORF Name 


NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


25507260_tl_9 


| |I796 | |3716 | 


338 | 


1017 


92 


0.048 | 



Protein name 

MHC class I antigen



Locus Name 



pir:I57454 



Acc# 
I57454



Description 

ORF Name 
|275283_tlJT5- 



NTID AAID 

DEZHZH 



TTTT 



NT 
Length 
T£5 



AA 
Length 
[FST — 



Score Probability 
|7.9e-i0 



Protein name 



Locus Name 



Hypothetical protein Rv2453c 



pir:D70864



ACC# 
D70864 



Description 

ORF Name 
|28b3437_tI3TT" 



][ 



NTID AAID 
11798 



[T7TS~ 



NT 
Length 





AA 
Length 

EEED 



Score Probability 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|29432768_C2__123 


1799 


3719 


91 | 


|276 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|30509B27_tlJ7 


| |1800 


| |3720 


134 


r 5 i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|31453i«_c2_i28 


|1801 


| |372i 


| 77 


|234 | |197 | 


|1.2e-15 | 


Protein name 








Locus Name 


Acc# 


Hypothetical protein 






gp:AF213822


AF213822 


Description 


Zymomonas mobilis strain ZM4 fosmid clone 42B3, complete sequence.


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


|33331633_cl_83 


| 1802 


3722 


1 


|1566 | 2362 | 


4.4e-245 | 



Protein name 



Locus Name 



respiratory nitrate reductase beta subunit 



gp:PSNARXL



ACC# 
Y15252 



Description 



Pseudomonas aeruginosa narX, narL, narK1, narK2, narG, narH, narJ, narI,
nifM, moaA genes.



ORF Name 



NTID AAID 



133758515 tl 6 



TBTT3" 



TTTT 



NT AA 
— — Score 

Length Length 

ETJ ED 



Protein name 
Description 

NO-HIT



Locus Name 



Probability 



Acc# 






ORF Name 



|363!>1!>!>2 F3 61 



Protein name 



NT ID AAID 



NT AA 

— — Score Probability 
Length Length *■ 



1804 | [ 3 7 24 | |91 | |276 | p5~ | |3.8e-10 ~ 

Locus Name Acc# 



hypothetical protein ssr1527



Description 



pir:S75710



S75710:S75718



ORF Name 


NT ID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|35371012_c2_102 


| |1805 | |3725 


i 50 i 


1183 1 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|3906555_12_36 


|1806 3726 


1 1 


|519 | 296 


3.8e-26


Protein name 








Locus Name 




Acc# 


probable molybdenum-pterin-binding protein




pir:S57954




S57954 


Description 














ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


4011062_cl_91 


1807 3727 


427 


1284 1152 


7.4e-117


Protein name 








Locus Name 




ACC# 


nitrate extrusion protein






gp:PSNARXL


Y15252 



Description 



Pseudomonas aeruginosa narX, narL, narK1, narK2, narG, narH, narJ, narI,
nifM, moaA genes.



ORF Name 
|4070308_crTTT 



NTID AAID 
j pig 



][ 



NT 
n 



AA 

— Score 
Length Length 



TTZT 



Probability 
I2.0e-5S 



Protein name 

Description 
MOLYBDOPTERIN BIOSYNTHESIS MOEA PROTEIN



Locus Name 



sp:MOEA_HAEIN 



Acc# 
P45210 






ORF Name 



NT ID AAID 



1434400:* 13 59 



TTZT 



NT AA 

r — ^ r — ^ Score 

Length Length 

353 1 [TuW 



7TT 



Probability 
7.0e-73 



Protein name 



Description 



Locus Name 



sp:MOAA_HAEIN 



Acc# 
P45311 



MOLYBDENUM COFACTOR BIOSYNTHESIS PROTEIN A






1 


ORF Name 


NT ID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|4788876_c3_i33 | 


|1810 | |3730 


1 a4S 1 


|747 | 


m* | 


|2.2e-69 | 



Protein name 



Locus Name 



respiratory nitrate reductase gamma subunit 



gp:PSNARXL



Acc# 
Y15252 



Description 



Pseudomonas aeruginosa narX, narL, narK1, narK2, narG, narH, narJ, narI,
nifM, moaA genes.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|4797093_r3_52 


1 l i8ii 


| 3731 


|157 


|474 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


I4806502_c2_127 


1812 


| 3732 


102 | 


|309 | |117 | 


p.be-07 | 


Protein name 








Locus Name 


Acc# 


negative regulator of translation




gp:AF213822


AF213822 



Description 

Zymomonas mobilis strain ZM4 fosmid clone 42B3, complete sequence.






ORF Name 



NTID AAID 



4885251 t2 35 



TXTT 



T7TT 



NT 
n 



AA. 

— Score 
Length Length 



Protein name 



[507 | | 



Locus Name 



Probability 
|2.7e-32 



molybdenum cofactor biosynthesis protein C



gp:AF108766



Acc# 
AF108766 



Description 



Rhodobacter sphaeroides AsmA (asmA) gene, partial cds; YbaU
(ybaU), anthranilate synthase component I (trpE), YibQ (yibQ),
anthranilate synthase component II (trpG),
anthranilate phosphoribosyltransferase (trpD), indole-3-glycerol
phosphate synthase (trpC), molybdenum cofactor biosynthesis protein C



ORF Name 


NTID 


AAID 


NT 
Length 


— Score 
Length 


Probability 


|4897576_c3_i47 | 


|1814 


1 P 734 


ii" i 


I 210 1 




Protein name 










Locus Name 


Acc# 


Description 














NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


5282811>_r3_56 | 


|181ii> 


3735 




| p | 


|0.(m | 


Protein name 










Locus Name 


Acc# 


MDP1 








gp:AB013441


AB013441 


Description 


Mycobacterium bovis gene for MDP1, complete cds






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|S30053_r2_47 | 


|1816 


1 p-m 


1 P«* 1 


|I158 | |320 | 


i.ie-28 | 


Protein name 










Locus Name 


Acc# 


ORF396 protein


gp:PSDNGC 


Z73914


Description 




Pseudomonas stutzeri orf175 gene.


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 




1817 


|3737 


1 Vb 1 


|228 | 





Protein name 



Description 



Locus Name 



ACC# 



NO-HIT






ORF Name 



NTID 



AAID 



NT AA 
— , — , Score 
Length Length 



7054692 cl 86 



369 



Probability 
9.3e-39 



Protein name 



Locus Name 



NifM protein



gp:PSNARXL 



Acc# 
Y15252 



Description 



Pseudomonas aeruginosa narX, narL, narK1, narK2, narG, narH, narJ, narI,
nifM, moaA genes.



ORF Name 



7225637 E3 bO 



NTID 
] [1815 



AAID 



\TTTT 



NT AA 
Length Length 
T7J5 1 15220 



Score Probability 
|540 | |6.2e-50 ~ 



Protein name 



Locus Name 



filamentous hemagglutinin-like protein
PspA: probable secreted protein 



pir:T09083



Acc# 
"I T09083 



Description 

ORF Name 
| 98147bi_ci_131 

Protein name 



NTID 



AAID 



AA 

— , Score 
Length Length 



][ 



NT 
5ng 
j |TT7T 



Probability 
5uTF] |0.0 ~ 



Locus Name 



alpha-subunit of nitrate reductase



gp:PFU71398



Acc# 
U71398 



Description 



Pseudomonas fluorescens nitrate reductase alpha-subunit (narG)
and beta-subunit (narH) genes, partial cds.



NT 

ORF Name NTID AAID . — . , 

Length 


AA 

. — . , Score 
Length 


Probability 


1176576_t3_28 | 1821 | 3741 | |157 | 


|474 | |314 | 


4.7e-28 | 


Protein name 


Locus Name 


ACC# 




sp:YAII_ECOLI 


P52088:P75703


Description


Hypothetical 17.0 kd protein in proc-arol intergenic region


NT 

ORF Name NTID AAID , — ^ 

Length


AA 

. — . , Score 
Length 


Probability 


14S47507_t2_20 |1822 | |3742 | 405 | 


|1218 | 386 


l.Ie-35 | 


Protein name 


Locus Name 


ACC# 


conserved Hypothetical protein aq_74U 


pir:A70365


A70365 



Description 






ORF Name 



NT ID 



AAID 



NT AA 
— — Score 
Length Length 



Probability 



234437b2 El 23 



1823 



] [ 



1986 | |2 95 | |3.4e-23 



Protein name 



Locus Name 



sp:YTRP_PSEPU 



Acc# 
P40604 



Description 

HYPOTHETICAL 52.7 KD PROTEIN IN TRPE-TRPG INTERGENIC REGION PRECURSOR



ORF Name 



23610636 c3 S8 



NTID 



AAID 



NT 
Length 

T73 



AA 

_ — Score 
Length 



] EEO 



7^" 



Probability 
B.7e-7S 



Protein name 



Description 



Locus Name 



sp:YQCB_HAEIN



Acc# 
P44197 



HYPOTHETICAL PROTEIN HI1435


ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


24644035_c2_45 


— | |182S | |3745 


215 


\660 | |255 | 


8.4e-22


Protein name 








Locus Name 




Acc# 


probable citrate lyase beta chain






pir:T35052 


T35062 


Description 














ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


250251_i:2_i8 


| |1826 | |3746 


1 152 1 


P 


79 | |525 | 


|2.0e 


I 


Protein name 








Locus Name 




Acc# 



sp:PUR6_HAEIN



P43849 



Description 
(EC 4.1.1.21) (AIR CARBOXYLASE) (AIRC)



ORF Name 



125485753 11 1 



Protein name 



Description 



NTID 



TF2T" 



AAID 



T7TT 



NT 
Length 
S3" 



AA 
Length 
[27TT 



— , — , Score Probability 



Locus Name 



Acc# 



NO-HIT






r\"0 ~W Mama XTPTT^ 7V 7\ T'H 


NT 
Length 


AA 

. — . , Score 
Length 


Probability


125510974 cl 34 1825 1 13748 
1 - - II 


1 185 1 
1 1 


l 55S 1 P 1 1 


|7.4e-30 1 
1 1 


Protein name 






Locus Name 


Acc# 








sp:YBEQ_ECOLI 


P77234 


Description 










HYPOTHETICAL 37.3 KD PROTEIN IN LEUS-GLTL INTERGENIC REGION


ORF Name NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|29304668_t3_30 | |1829 3749 


| 302 


|909 | 556 | 


1.1e-53


Protein name 






Locus Name 


Acc# 








sp:SYK3_ECOLI


Description 








141 


HYPOTHETICAL LYSYL-TRNA SYNTHETASE HOMOLOG, (GX)


ORF Name NTID AAID 


NT 
Length 


AA 

t — . i Score 
Length 


Probability 


|3127530i_il_li | |1830 ] |3750 


1 1 352 i 


|1059 | (516 | 


|1.8e-49 | 


Protein name 






Locus Name 


ACC# 








sp:PDXB_ECOLI 


P05459 


Description 










ERYTHRONATE-4-PHOSPHATE DEHYDROGENASE,


ORF Name NTID AAID 


NT 
Length 


AA 

t — , i Score 
Length 


Probability 


34179211__13_24 1831 |3751 


330 


|993 | 158 


1.2e-08 


Protein name 






Locus Name 


Acc# 


probable protein serine-threonine phosphatase




pir:C75297


C75297


















Description 










ORF Name NTID AAID 


NT 
Length 


AA 

m — . , Score 
Length 


Probability 


36365625_t2_i7 1832 |3752 


147 


|444 | 348 | 


|i.2e-3i | 


Protein name 






Locus Name 


ACC# 


Hypothetical protein jhp1377




pir:D71815 


D71815 



Description 








\TTTD 
IN 1 ±U 




NT 
Length 


AA 

. — . , Score 
Length 


Probability


15118952 cl 35 

1 — — 


1833 


13753 
1 


1 1 1 


|1212 | (703 | 


1 1 


Protein name 










Locus Name 


ACC# 












sp:PYR2_PSEAE 


Q51551 


Description 














CATALYTIC CHAIN) 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


5275300_r3_22 


1 1834 


1 1" 54 


| 347 


1044 | |954 


|7.1e-96 1 


Protein name 










Locus Name 


Acc# 












sp:BIOB_ECOLI


P12996 


Description 














BIOTIN SYNTHASE, (BIOTIN SYNTHETASE)






i 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability


5948342_t3J27 


1835 


3755 


259 


|7so | prr - 1 


|S.0e-5i 


Protein name 










Locus Name 


ACC# 












sp:PURK_PSEAE


P72158


Description 














(AIR CARBOXYLASE) (AIRC)












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


|704i_ii_8 


|1836 


| 3756 


i m i 


P»* | |133 | 


|1.5e-10 | 


Protein name 










Locus Name 


ACC# 












sp:PURK_AQUAE 


O66608


Description 














(AIR CARBOXYLASE) (AIRC)












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


1299i392_r2_i8 


1837 


3757 


1 1 101 i 


|306 | |1S0 | 


9.7e-12 


Protein name 










Locus Name 


ACC# 


unknown 








gp:PDU08856 


U08856



Description 
Paracoccus denitrificans insertion sequence IS1248b, complete sequence.






ORF Name 



NT ID 



AAID 



NT 



AA 



— , — , Score 



Length Length 



15110912 c3 69 



1838 



Protein name 



3 7 58 | | 35fl | [1077 | p^O | 

Locus Name 



sp:YQJM_BACSU



Probability 
5.7e-58

Acc# 
P54550 



Description 

PROBABLE NADH-DEPENDENT FLAVIN OXIDOREDUCTASE YQJM



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


156327Bl_c2J>l 


| |1839 


| |3759 


1 P 1 


I 240 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — . i Score 
Length 


Probability 


1567^ai6_t3_31 


| J1840 


| |3760 


1 I 472 1 


|1419 | (1772 | 


|1.5e-182 


Protein name 








Locus Name 


Acc# 



chain R:type I restriction enzyme, Hsd, chain 
R:type I restriction-modification system, 



pir:JC5216



Description 

ORF Name 
|19S23B78_t3Jii) 
Protein name 



NTID 



11841 



AAID 



NT 
Length 
[405 



AA 
Length 
11218 



Score 



irnr 



Locus Name 



hypothetical protein



pir:A75592



Probability 

|1.0e-07 

Acc# 
A75592 



Description 



ORF Name 



|216420i>2 c2 70 



NTID 
|1842 



AAID 



— . , - — . , Score 



][ 



NT 
Length 

] EE 



] [ 



AA 
Length 



[585 | 



Probability 
|1.8e-67 



Protein name 

Description 
PUTATIVE NAD(P)H NITROREDUCTASE,



Locus Name 



sp:YC78_HAEIN



ACC# 

Q57431:O05050






ORF Name 



121673201 n 25 



Protein name 



protein Tp70 



Description 



NT ID 



AAID 



NT 
Length 



— i v_ T — Score Probability 



AA 
Length 

|663 [ |364 | |2.4e-33 ~ 
Locus Name Acc# 



pir:A71309



A71309:S18231:S19826



ORF Name 
|218907B_c2_66 



NTID 



AAID 



NT AA 
— , — , Score 
Length Length 



Probability 



Protein name 



Description 



] [1844 | [3754 | |126 | |381 | |195 | |1.9e-lB 

Locus Name Acc# 



sp:YPRO_OWEFU



P21260:P21261



HYPOTHETICAL PROLINE-RICH PROTEIN (FRAGMENT)






1 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


22i43752_tl_9 


| 1845 


1 P ?6b 


1 l tJt 


i 1911 1 


|2755 


7.9e-287



Protein name 



Locus Name 



type I site-specific deoxyribonuclease, Hsd
chain R:type I restriction enzyme, Hsd, chain 
R:type I restriction-modification system, 



pir:JC5216



Acc# 
JC5216 



Description 












ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


234375Sl_ri_4 | 


1846 


| |3766 


p 


252 ; 




Protein name 
Description 








Locus Name 


Acc# 


no-hit 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


|23490937_c3_8i 


1847 


| 3767 


p 


282 |73 | 


|0.037 


Protein name 


Locus Name 


Acc# 



dehydrogenase



gp:AF025836



AF025836 



Description 



Echinostoma sp.I. Africa nicotinamide adenine dinucleotide dehydrogenase
subunit 1 (ND1) gene, mitochondrial gene encoding mitochondrial protein,
partial cds.






ORF Name 



NTID 



AAID 



24704462 c2 51 



NT AA 
Length Length 
1 \5ZI — 



Score 



Probability 
|3.8e-26 



Protein name 



Locus Name 



cinnamyl-alcohol dehydrogenase



gp:AF083333



Acc# 
AF083333 



Description 

Medicago sativa cinnamyl-alcohol dehydrogenase (MsaCad1) mRNA, complete cds.



ORF Name 
|32667715_tJ_^6' 

Protein name 



NTID 



AAID 



NT 



AA 



Length Length 



Score Probability 



"j [1849 | [3769 | [188 | |567 | |105 | [0,00093 

Locus Name Acc# 



hypothetical protein TP0570



pir:H71308



H71308 



Description 

ORF Name 
|35350802_£l~r 
Protein name 



NTID AAID 



AA 

— Score 
Length Length 



TF5TT 



][ 



T77TT 



NT 

sn 

] EH 



] I 



Locus Name 



Probability 
|2.4e-31 

ACC# 



putative transposase 



gp:AF007429



Description 



EL 



AF007429 



Haemophilus paragallinarum IS-like putative transposase gene, complete cds.


ORF Name NTID AAID 


NT 
Length 


AA 
Length 


Score Probability 


|365S4842J:2_i9 • | JiflSi | |3771 | 


460 


p«3 


|573 | i.7e-55 | 



Protein name 



Locus Name 



type I site-specific deoxyribonuclease, Hsd
chain S:type I restriction enzyme, Hsd, chain 
S:type I restriction-modification system, 



pir:JC5218



Acc# 
JC5218 



Description 



ORF Name 



14032715 E3 27 



NTID 
] [1852 



AAID 



1 pm | ; 



NT AA 
Length Length 
TX5 1 — 



Score 



] E 



Protein name 

Description 
HYPOTHETICAL 14.4 KD PROTEIN Y4SM



Locus Name 



Probability 

|1.8e-17 

Acc# 



sp:Y4SM_RHISN



P50358 






ORF Name NT ID 


AAID


NT 
Length 


, , Score 
Length 


Probability 


14111333 t3 33 1853 

1 " " 


3773 


201 


1603 1 1238 
1 1 1 


5.3e-20 


Protein name 






Locus Name 


Acc# 








sp:NAHR_PSEPU


P10183 


Description 










TRANSCRIPTIONAL ACTIVATOR PROTEIN NAHR






ORF Name NT ID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


[4895127_cl_36 | |1854 


1 I 3774 1 


|116 


1351 1 1334 1 
1 III 


3.6e-30 | 


Protein name 






Locus Name 


Acc# 


Orf8






gp:AB011413


AB011413 


Description 










Streptomyces griseus genes for Orf2, Orf3, Orf4, Orf5, AdsA, Orf8, partial
and complete cds.




ORF Name NTID


AAID 


NT 
Length 


AA 

, , Score 
Length 


Probability 


|7034808_ci_49 | |1855 


1 P 775 1 


|70 


i p u i p i 


|0.033 


Protein name 






Locus Name 


Acc# 


hypothetical protein ZK856.5




pir:T28044


T28044 



Description 



ORF Name 



NT ID 



AAID 



NT 



AA 



Length Length 



Score Probability 



7080001 tl 2 



Protein name 



Description 



] [1856 | [3775 | |37i | [1116 | |238 | |2.3e-26 ~ 

Locus Name Acc# 



sp:YGCG_ECOLI



P55140 



HYPOTHETICAL 34.5 KD PROTEIN IN CYSJ-ENO INTERGENIC REGION (O313)


ORF Name 


NT AA 

NTID AAID , . , m 

Length Length 


Score 


Probability 


7083578__Cl_37 


1857 3777 | 50 183 


152 


2.4e-11



Protein name 



Locus Name 



NADP-dependent alcohol hydrogenase



gp:LMFL1063 



Acc# 
AL121862 



Description 

Leishmania major Friedlin chromosome 23 cosmid L1063, complete cds . 






ORF Name 



9782666 EE 7 



NT ID 
-] [1858 



AAID 



][ 



T77W 



NT AA 

— — Score 

Length Length 

1 11665 



] i 



Probability 
8.0e-252 



Protein name 



Locus Name 



ALXA and HSDM 



gp:PHU46781



Acc# 
U46781 



Description 



Pasteurella haemolytica putative coproporphyrinogen III oxidase (hemN') gene,
partial cds, leukotoxin transcriptional activator and restriction modification
methylase subunit (alxA-hsdM), (hsdS) and (hsdR) genes, complete cds.



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


14257160_tl_2 


| |1859 


| |3779 


i p 94 i 


r 5 i 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name


NT ID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|16180437_i:3J.8 


| |1860 


| p, 8 o 


1 P a 1 


|1509 | |1460| 


|1.7e-149 | 


Protein name 








Locus Name 


Acc# 



Description 



sp:GABD_ECOLI 



P25526 



, (SSDH) 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|19806552_tI_I 


| |1861 


| p781 


i r' 4 


pib | 


i" 1 1 


p.6e-07 | 



Protein name 



Locus Name 



probable ankyrin



pir:H71274 



Acc# 
H71274 



Description 



ORF Name 



24407502 t3 17 



NT ID 
■] [1862 



AAID 



][ 



TTWT 



NT 
Length 




Protein name 



AA 
Length 
|675 | |392 | 

Locus Name 



Score Probability 
2.5e-36 



glycine betaine/carnitine/choline ABC
transporter (membrane p) opuCD

Description 



pir:F69670



Acc# 
F69670 






ORF Name 



NTID AAID 



Probability 



125900252 c2 30 



TEST 



] 



] f 



NT AA 

— — Score 

Length Length 

] [ 1254 | [138 | |1.3e-0B 



Protein name 



Locus Name 



putative natural resistance-associated



gp:CCA133735



Acc# 
AJ133735 



Description 



Cyprinus carpio mRNA for putative natural resistance-associated macrophage
protein (NRAMP).



ORF Name 



134094385 ci 23 



NTID 
] [1854 



AAID 



twt 



NT AA 
Length Length 
[TuT 



Score 



] EH3 



TIT 



T5T 



Probability 
6.3e-11



Protein name 



Locus Name 



AttJ 



gp:U59485



Description 



Acc# 

U59485:L63540



Agrobacterium tumefaciens AtrC (atrC) gene, partial cds; AtrB (atrB), AtrA
(atrA), AttA1 (attA1), AttA2 (attA2), AttB (attB), AttC (attC), AttD (attD),
AttE (attE), and AttF (attF) genes, complete cds; AttG (attG) gene,
alternative splice products, complete cds; AttH (attH), AttI (attI), AttJ
(attJ), AttK (attK), AttL (attL), AttM (attM), AttO (attO), AttP (attP), AttR



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

Length Score Probability


|4770887_tl_3 


| |1S65 


|pm | 


|176 


|531 130 |2.7e-14 | 



Protein name 



Locus Name 



hypothetical protein



|gp:S3U18930 



Acc# 
"I Y18930 



Description 

Sulfolobus solfataricus 281 kb genomic DNA fragment, strain P2.



ORF Name 



4875260 c2 33 



NTID 
] |1B66 



AAID 



NT AA 
Length Length 



Score Probability 



] E!Z] 



Protein name 
Description 

NO-HIT



Locus Name 



Acc# 






ORF Name 
|4884702__cI^ZT 



NT ID 



AAID 



T7FT 



NT 
Length 

] EED 



AA 



Protein name 



Score Probability

Length — -L 

| p^"| |i.4e-37 ~ 

Locus Name Acc# 



NonF 



gp:AF074603 



AF074603 



Description 



Streptomyces griseus subsp. griseus nonactin biosynthesis gene cluster,
partial sequence.



ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


6740S92_i2_10 


1868 


1 I 3788 


165 


498 | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


7218752J:3_15 


1869 


3789 


1 1 1 " 


|390 | |88 | 


|0.030 | 



Protein name 



Locus Name 



putative polysaccharide polymerase 



gp:SPCPS14E



ACC# 
X85787 



Description 



S. pneumoniae cps14 locus.


ORF Name NTID AAID . — . , 

Length 


AA 

. — . , Score 
Length 


Probability 


786305_rl_5 | 1870 3790 j 317 


|954 | 532 


9.4e-52 | 


Protein name 


Locus Name 


Acc# 


probable osmoprotection binding protein


pir:G71892


G71892 


Description 






NT 

ORF Name NTID AAID — , 

Length 


AA 

_ — . . Score 
Length 


Probability 


792090_r2_12 | |1871 | |3791 148 


I 447 1 




Protein name 


Locus Name 


Acc# 



Description 



NO-HIT








NTTD 

IN 1 1U 


AATD 


NT 
Length 


AA 










Length 




112273437 £3 58 
1 - - , 


1 11872 
1 1 


1 13792 
1 1 


J L 


1014 1 11680 I 
1 1 1 


|8.3e-173 

1 


Protein name 








Locus Name 


Acc# 










sp:SYGA_MORCA


P77892 


Description 












ALPHA CHAIN) 


(GLYRS) 










ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


|14181500_t2_36 


1873 


| |3793 


1 1 


2052 | 1593 | 


|1.4e-163 


Protein name 








Locus Name 


Acc# 



sp:SYGB_HAEIN



P43822 



Description 

BETA CHAIN) (GLYRS)



ORF Name 


NT 

NTID AAID — 

Length


AA 
Length 


Probability 


|195500<;2__cl_i0i 


1874 3794 |279 | 


840 |581 | 


|2.4e-56 | 


Protein name 




Locus Name 


ACC# 






sp:BUDC_KLEPN


Q48436 


Description 








ACETOIN (DIACETYL) REDUCTASE, (ACETOIN DEHYDROGENASE) (AR)




ORF Name 


NT 

NTID AAID _ — . , 
Length 


AA 

. — . , Score 
Length 


Probability 


|21648382_ti_22 


1875 | 3795 | 279 


|840 | |8i3 


|S.2e-81 


Protein name 




Locus Name 


Acc# 



Description 
(EC 6.4.1.2)



sp:ACCA_ECOLI 



P30867 



ORF Name 



121650017 c2 112 



NTID 
11876 



AAID 



NT 
n 
255 



AA 

— Score 
Length Length 



Protein name 



CO ED 

Locus Name 



Probability 
|4.3e-4i 



sp:LRTP_ECOLI 



Acc# 
P23885 



Description 

LEUCYL/PHENYLALANYL-TRNA--PROTEIN TRANSFERASE,






ORF Name


NTID AAID


NT 
Length 


AA 

. , Score 
Length 


Probability


121657752 c3 147 
1 - - 


1 1877 1 13797 
1 1 1 


1 1 


I 11038 1 1580 1 

II II I 


|3.0e-56 

1 


Protein name 






Locus Name 


Acc# 








sp:YZ37_SYNY3 


Q55480 


Description 










HYPOTHETICAL SUGAR KINASE SLR0537








ORF Name


NTID AAID


NT 
Length 


AA 

Score
Length 


Probability


|219878il_r2_34 


I 11878 1 13798 

II II 


1 1237 
1 1 


I l 7 14 | |384 | 

II II 1 


|6.2e-41 
1 


Protein name 






Locus Name 


Acc# 








|sp:PGSA_HAEIN 


| P44528 


Description 










(EC 2.7.8.5) (PHOSPHATIDYLGLYCEROPHOSPHATE SYNTHASE) (PGP SYNTHASE)


ORF Name 


NT ID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


22038i32_t3_S7 


|1879 | |3799 


i « 


231 




Protein name 






Locus Name 


Acc# 


Description 










NO-HIT 


ORF Name 


NTID AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


|2238462B_tl_5 


| |1880 | (3800 


| |44B 


! [1347 | J1005 1 


|2.8e-101 


Protein name 






Locus Name 


Acc# 








|sp:YKGC_ECOLI 


j P77212 


Description 










INTERGENIC REGION 


ORF Name 


NTID AAID 


NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


23493812_£2_33 


1881 | |3801 


1 p9S 


|2997 |972 | 


|6.2e-119 


Protein name 






Locus Name 


Acc# 


metalloprotease 1






gp:AF061243


AF061243 



Description 



Homo sapiens metalloprotease 1 (MP1) mRNA, complete cds.






ORF Name 



NT ID AAID 



123875027 F2 50 



TF8T" 



][ 



NT AA 
Length Length 

] — I F*H 



Score 
1621 | 



Probability 
|i.4e-60 



Protein name 



Description 



Locus Name 



sp:RLUC_HAEIN 



ACC# 
P44433 



(PSEUDOURIDYLATE SYNTHASE) (URACIL HYDROLYASE)



ORF Name 



124118802 c3 138 



Protein name 



NT ID 



AAID 



TTOT 



NT AA 

— — Score 

Length Length 

TZl 1 \TTZZ- 



[1742 | 



Probability 
|2.2e-179 



Locus Name 



serine hydroxymethyl transferase



gp:AF073769 



Acc# 
AF073769 



Description 



Acinetobacter radioresistens serine hydroxymethyl transferase (glyA) gene,
complete cds.



ORF Name 



NT ID 



AAID 



124258777 c3 134 



TOUT- 



NT 
Length 

] F^ 1 - 



— , . — x , Score 



AA 
Length 



1 1 



Probability 
1.5e-150



Protein name 



Locus Name 



ribonuclease E, :cell shape-determining
protein:message stability-altering
protein:RNase E



pir:S27311



Description 

ORF Name 
[24300257J:i_li 



Acc# 

A64852:S45572:S27311:A23747:JG



NT ID 



AAID 



(T5W 



13805 



NT 
Length 




AA 

T — ^ Score 
Length 



Protein name 



ED ED 

Locus Name 



Probability 
3.8e-10 



conserved hypothetical protein



pir:B72287



Acc# 
B72287 



Description 



ORF Name 



125557776 cl 102 



NTID AAID 
1 885 I 



NT 
Length 
TT5 



AA 

_ — . . Score 
Length 



Protein name 

Description 
PHAGE SHOCK PROTEIN E PRECURSOR 



|450 | |180 | 

Locus Name 



Probability 
7.4e-14 



sp:PSPE_ECOLI 



Acc# 
P23857 






ORF Name 



NT ID AAID 



NT AA 
„ — ^n. „ — ^ Score Probability 
Length Length *- 



125555585 c3 149 



1887 




3807 




348 




1047 




591 





5.2e-68 



Protein name 



Locus Name 



hypothetical protein sirovav



pir:S77001



Acc# 
S77001 



Description 



ORF Name 



26754011 cl 85 



NTID 
"] [1888 



AAID 



NT AA 
— — Score 
Length Length 



Probability 



Protein name 



2 |357 | [1074 | [1755 | |8.1e-182 
Locus Name 



NAD repressor/NMN transporter NadRp



gp:MCU73324



Acc# 
U73324 



Description 



Moraxella catarrhalis glycyl-tRNA synthetase beta subunit (GlyRS) and NAD
repressor/NMN transporter NadRp (NadR) genes, partial cds, and glycyl-tRNA
synthetase alpha subunit (GlyRS) gene, complete cds.



ORF Name 
[2845537_c3_137 



AAID 



NTID 

] [1889 | prre~ 



NT AA 
Length Length
Protein name
sp:UBIC_ECOLI
Protein name



Locus Name 



exopolyphosphatase



Description 



E 



AF053463



Acc# 
AF053463 



Pseudomonas aeruginosa thioredoxin (trx) and exopolyphosphatase (ppx) genes,
complete cds.
Probability
Protein name



Locus Name 



Hypothetical protein 



E 



PPPC2 



Acc# 
Y11998 



Description 
P. fluorescens FC2.1, FC2.2, FC2.3C, FC2.4 and FC2.5c open reading frames.
NTID AAID
Length Length
Protein name



Description 



Locus Name 



sp:PNUC_ECOLI



Acc# 

P31215:P77
PNUC PROTEIN
NTID
Length
Length


Probability
| |3813
1227 931 |


8.5e-100 


Protein name 








Locus Name 


Acc# 



sp:YHIN_ECOLI 



Description 

HYPOTHETICAL 43.8 KD PROTEIN IN RHSB-PIT INTERGENIC REGION
Length Length
Score
Protein name



Locus Name 



translation elongation factor eEF-1 alpha
chain PIK-A49: phosphatidylinositol 4-kinase
activator PIK-A49



pir:A45325
Locus Name
ACC#

A45325:B45325:C45325:D45325:E4



Probability 
1.9e-47 

Acc# 
P46847 



HYPOTHETICAL 21.0 KD PROTEIN IN BIOH-GNTT INTERGENIC REGION
Probability
Protein name


Locus Name 


Acc# 



sp:YEMU_COXBU 



P45680 



Description 

HYPOTHETICAL lb. 8 KD PROTEIN IN ETOPEPHH INTERGENIC REGION
1897 | |3817


i i bjt i
Protein name






Locus Name 


Acc# 


isocitrate lyase 






gp:AB004651 


AB004651 


Description 


Hyphomicrobium methylovorum gene for isocitrate lyase, inorganic phosphate
transporter, methionine synthase, complete and partial cds.
Protein name
conserved hypothetical protein yerL
Protein name
sp:AMID_MORCA



Q49091 



Description 
PUTATIVE AMIDASE
Protein name

Description 
CELL DIVISION INHIBITOR MINC
Protein name
Protein name
CYTOCHROME
Protein name

Description 
HYPOTHETICAL 36.3 



Locus Name 



sp:YIHG_ECOLI



Acc# 
P32129 



KD PROTEIN IN DSBA-POLA INTERGENIC REGION
Protein name



1907 | 3827 | 231 | 696 | 389 | 5.3e-36

Locus Name 



outer membrane protein homolog



gp:AF067083 



ACC# 
AF067083 



Description 



Vitreoscilla sp. outer membrane protein homolog gene, complete cds; Trp
repressor binding protein gene, partial cds; and unknown genes.
M. catarrhalis


JDla gene.
gag protein








gp:MUSERVGG2 


M26006 



Description 



Mouse endogenous retrovirus truncated gag gene, complete cds,
15.3.
clone del env-2
OLIGOPEPTIDASE A,
Protein name
sp:YBHP_ECOLI



P75772 



Description 

HYPOTHETICAL 28.8 KD PROTEIN IN MOAE-RHLE INTERGENIC REGION
Locus Name



sp:YHEV_ECOLI 



Acc# 
P56622 



Description 

HYPOTHETICAL 7.6 KD PROTEIN IN SLYD-KEFB INTERGENIC REGION
prolyl oligopeptidase, precursor



1917 | [ 3837 | |923 | [2772 | [T7B~ | |2.2e-100 

Locus Name Acc# 

A38086 



pir:A38086



Description 



ORF Name 



3907b68 c2 7% 



NTID 
[1918 



AAID 



NT AA 
— , — , Score 
Length Length 



Probability 



]EE!Z1 1 



simr 



TIT 



Protein name
AF162221



Description 

Xestia c-nigrum granulovirus genome, complete sequence.
voltage-dependent anion channel protein 1b
Description

Zea mays voltage-dependent anion channel protein 1b (vdac1b) mRNA, complete
cds; nuclear gene for mitochondrial product.